FIELD OF THE INVENTION



The present invention relates to nucleic acids encoding SID®
polypeptides which bind selectively to a polypeptide encoded by a
pathogenic strain of the hepatitis C virus, as well as to the SID®
polypeptides which are encoded by said nucleic acids.

The invention also concerns vectors comprising a nucleic acid 
encoding a SID® polypeptide as well as host cells transformed with such 
vectors. 

The invention is also directed to two-hybrid methods which
make use of the nucleic acids encoding a SID® polypeptide selected
from a pathogenic strain of the hepatitis C virus as well as to methods for
selecting molecules which inhibit the binding between a SID®
polypeptide and a polypeptide which specifically binds thereto.

The invention also pertains to marker compounds containing a
SID® polypeptide, as well as nucleic acids encoding such marker
compounds and methods and kits using the same.

BACKGROUND OF THE INVENTION 


The hepatitis C virus (HCV) causes several liver diseases,
including liver cancer. The HCV genome is a plus-stranded RNA that
encodes a single polyprotein which is processed into at least 10 mature
polypeptides.

The structural proteins are located in the amino-terminal quarter
of the polyprotein, and the non-structural (NS) polypeptides in the
remainder (for a review, see HOUGHTON, 1996). The genome
organisation resembles that of flaviviruses and pestiviruses, and HCV is
now considered to be a member of the Flaviviridae family.

The gene products of HCV are, from the N-terminus to the
C-terminus: core (p22), E1 (gp35), E2 (gp70), NS2 (p21), NS3 (p70), NS4a
(p4), NS4b (p27), NS5a (p58), NS5b (p66), as disclosed in figure 1. Core,
E1 and E2 are the structural proteins of the virus, processed by the host
signal peptidase(s). The core protein and the genomic RNA constitute
the internal viral core, and E1 and E2 together with the lipid membrane
constitute the viral envelope (DUBUISSON et al., 1994; GRAKOUI et al.,
1993; HIGIKATA et al., 1993).

The NS proteins are processed by the viral protein NS3, which
has two functional domains: one (Cro-1), encompassing the NS2 region
and the N-terminal portion of NS3, cleaves autocatalytically
between NS2 and NS3, and the other (Cro-2), located solely in the
N-terminal portion of NS3, cleaves the other sites downstream of NS3
(BARTENSCHLAGER et al., 1995; HIGIKATA et al., 1993).

Various HCV protein-protein interactions have already been
identified, notably by two-hybrid methods. In particular, FLAJOLET et al.
(2000) have shown interactions between NS3 and NS4A proteins as well
as between NS4A and NS2 proteins. These authors have also shown
core-core, NS3-E2, NS5A-E1, NS4A-NS3 and NS4A-NS2 interactions.
Covalent as well as non-covalent interactions between E1 and E2 have
been shown by PATEL et al. (1999). The protein interactions between
NS3 and the HCV RNA helicase have also been described (MIN et al.,
1999; GALLINARI et al., 1999) as well as the interaction between NS3 and
NS4A (URBANI et al., 1999; DI MARCO et al., 2000; BUTKIEWICZ et
al., 2000).

However, the prior art methods allow the determination of
interactions between full-length proteins or large domains of proteins
encoded by the genome of the hepatitis C virus, which may contain more
than one region of interaction with one or several HCV proteins.
BUTKIEWICZ et al. (2000) disclose the interaction between the NS3
protease and a small peptide derived from NS4A. However,
BUTKIEWICZ et al. (2000) disclose exclusively in vitro assays for
interactions between the small peptides derived from NS4A and the NS3
protease from HCV, which may not be of physiological relevance.

There is a need in the art for polypeptides that contain the
minimal aminoacid sequence that is able to bind specifically to a
naturally-occurring HCV protein in physiological conditions, in order to
design new tools for therapeutic and detection purposes related to HCV.

SUMMARY OF THE INVENTION 

This invention provides nucleic acids encoding polypeptides,
which are termed SID® polypeptides, wherein these polypeptides are the
final products of a double selection method involving a first step of
selection of HCV-derived polynucleotides through a two-hybrid system
and a second selection step involving an alignment between the different
polynucleotides selected at the first step.

The invention also pertains to the SID® polypeptides encoded
by the SID® nucleic acids.

Further objects of the invention are recombinant vectors
containing a SID® nucleic acid as defined above, as well as host cells
transformed with such vectors or nucleic acids.

A further object of the invention consists of two-hybrid methods
which make use of these SID® nucleic acids, as well as methods for
selecting molecules which inhibit the binding between a SID®
polypeptide and a polypeptide that binds specifically thereto, as well as
kits for performing these methods.

It is still a further object of the invention to provide marker
compounds which comprise a SID® polypeptide or which are encoded
by a polynucleotide containing a SID® nucleic acid as defined above, as
well as methods and kits which make use of these marker
compounds.

This invention also relates to pharmaceutical compositions as
well as to methods for preventing or curing a HCV viral infection in a
human or an animal that use a SID® polypeptide or a SID® nucleic acid
as disclosed herein.

Throughout this application, various publications, patents and
published patent applications are cited. The disclosures of these
publications, patents and published patent specifications referenced in
this application are hereby incorporated by reference into the present
disclosure to more fully describe the state of the art to which this
invention pertains.

BRIEF DESCRIPTION OF THE FIGURES. 


Fig. 1 is a general overview of the HCV genome and its
encoded polyprotein. The RNA coding strand is represented with a line
for untranslated regions (NCR) and boxes for coding regions.
Positions and enzymes responsible for cleavage are indicated
above. p7 is a secondary cleavage product of E2 (adapted from
HOUGHTON, 1996).

Fig. 2 is a restriction map of the plasmid pAS2ΔΔ which may be
used for producing a recombinant " Selected Interacting Domain (SID®) "
polypeptide or a recombinant marker compound of the invention.

Fig. 3 is a restriction map of the plasmid pACTII which may be
used for producing a recombinant " Selected Interacting Domain
(SID®) ".

Fig. 4 is a restriction map of the plasmid pUT18 which may be
used for producing a recombinant " Selected Interacting Domain
(SID®) ".

Fig. 5 is a restriction map of the plasmid pUT18C which may be
used for producing a recombinant " Selected Interacting Domain
(SID®) ".

Fig. 6 is a restriction map of the plasmid pT25 which may be
used for producing a recombinant " Selected Interacting Domain
(SID®) ".

Fig. 7 is a restriction map of the plasmid pKT25 which may be
used for producing a recombinant " Selected Interacting Domain
(SID®) ".

Fig. 8 is an illustration of the first step of selecting a SID®
nucleic acid of the invention, in which different sets of overlapping
nucleic acids primarily selected through a two-hybrid method are
gathered in order to define pre-SID® nucleic acids. Three
fragments frg1, frg2 and frg3, of lengths l1, l2 and l3 respectively, are shown.
Fragments frg1 and frg2 are clustered together if the length of intersection, I,
is greater than 30% of l1 and l2. Fragment frg3 is grouped with
fragments frg1 and frg2 if the length of intersection between frg1 and
frg3, I', is greater than 30% of l1 and l3 and if the length of intersection
between frg2 and frg3, I'', is greater than 30% of l2 and l3.

Fig. 9 illustrates the selection of a pre-SID® nucleic acid from a
particular set of overlapping nucleic acids previously selected through a
two-hybrid method. The pre-SID® is defined as the intersection of all the
fragments (frg 1-6) in a cluster.

Fig. 10 illustrates the selection of a SID® nucleic acid from the
overlapping regions between two pre-SID® nucleic acids. A SID® is
defined if the length of overlap between two pre-SID®s, I, is greater than
30 bp. Further SID®s are defined by non-overlapping areas if their length
(Y) represents more than 30% of the length of one of the fragments
which contribute to the corresponding pre-SID® (frg1-6).

Fig. 11 illustrates a further step of determining SID® nucleic
acids after alignment of two overlapping SID® nucleic acids identified
according to figure 10. Fragments frg1' and frg2' contribute to both
SID®1 and SID®2 (top panel). For each SID®, the number of fragments
is counted and fragments are assigned to the SID® with the most
fragments. The remaining fragments are re-analysed and a new SID® is
defined as the region of intersection of these fragments (bottom panel,
SID®2': fragment 3' and fragment 4').

Fig. 12 illustrates a map of the vector pB5 which may be used in
example 1.

Fig. 13 illustrates a map of the vector pP6 which may be used in
example 1.

DETAILED DESCRIPTION OF THE INVENTION 

The present invention firstly provides for nucleic acids encoding
SID® polypeptides.

As generally used herein, a « bait » nucleic acid encodes a
« bait » polypeptide. A polypeptide is termed a « bait » polypeptide when
this polypeptide is used to select a formerly unknown « prey » nucleic
acid encoding a « prey » polypeptide which binds selectively with said
« bait » polypeptide. Indeed, a « prey » nucleic acid which has been
selected for binding to a given bait polypeptide may be used in another
selection method or in another round of the same selection method as a
« bait » nucleic acid encoding a « bait » polypeptide for the purpose of
selection of new prey nucleic acids, encoding prey polypeptides which
bind selectively with said bait polypeptide, it being understood that the
nucleic acid encoding said bait polypeptide was formerly selected from a
population of prey nucleic acids.


SELECTED INTERACTING DOMAIN (SID®) POLYPEPTIDES AND 
METHODS FOR THEIR PREPARATION. 

A selected interacting domain polypeptide that binds specifically
to a polypeptide of interest is the result of a two-step screening
procedure, wherein:

1) the first step consists of selecting and characterizing a
collection of nucleic acids (prey nucleic acids) encoding polypeptides
which bind specifically to a given bait polypeptide of interest;
and

2) the second step of the two-step procedure consists of
determining the nucleic acid sequences which encode SID®
polypeptides after having generated sets of polynucleotides from the
collection of nucleic acids selected at step 1).

As a result of the original two-step screening procedure
disclosed hereunder, every nucleic acid finally selected encodes a
« Selected Interacting Domain (SID®) » polypeptide which binds with
high specificity to the bait polypeptide of interest.

Step 1) Selecting prey nucleic acids 


The first step of selecting a collection of nucleic acids encoding
polypeptides which bind specifically to the bait polypeptide is carried out
through a yeast two-hybrid system. The yeast two-hybrid system is
designed to study protein-protein interactions in vivo, and relies upon the
fusion of a bait protein to the DNA binding domain of the yeast Gal4
protein.

According to the present invention, the first step of the
procedure for selecting a Selected Interacting Domain (SID®)
polynucleotide encoding a Selected Interacting Domain (SID®)
polypeptide consists of the two-hybrid screening system described by
Fromont-Racine et al. (1997) or the method described by FLAJOLET et
al. (2000). The yeast two-hybrid system utilizes hybrid proteins to detect
protein-protein interactions by means of direct activation of the expression
of a reporter gene. In essence, the nucleic acids encoding the two putative
protein partners, the bait polypeptide of interest and the prey
polypeptide, are genetically fused to the DNA-binding domain of a
transcription factor and to a transcriptional activation domain,
respectively.

Construction of the prey HCV nucleic acids library.

A genomic DNA library prepared from the genome of the
pathogenic H77 strain of HCV (Yanagi et al., 1997) is constructed in the
specially designed vector pP6 shown in figure 13 after ligation to suitable
linkers, such that every genomic DNA insert is fused to a nucleotide
sequence in the vector that encodes the transcriptional activation domain of the
Gal4 protein.

The polypeptides encoded by the nucleotide inserts of the
genomic DNA library thus prepared are termed " prey " polypeptides in
the context of the presently described selection method of prey nucleic
acids.

Construction of the bait nucleic acids library.

The DNA fragments obtained after nebulization of the HCV
genomic DNA are also inserted in plasmid pB5 shown in figure 12,
wherein these DNA inserts are fused to a polynucleotide encoding the
DNA binding domain of the Gal4 protein, and the recombinant vectors are
used to transform E. coli cells. The transformed E. coli cells are grown
and plasmid DNA is extracted and sequenced.

These plasmids, which encode in-frame fusion proteins, are used
as bait plasmids. Bait plasmids thus consist of a collection of
recombinant pB5 plasmids, each containing, inserted therein, a DNA
fragment from the H77 strain HCV genome encoding a polypeptide
consisting of all or part of a HCV protein or alternatively a polypeptide
consisting of all or part of two HCV proteins encoded by contiguous
nucleic acid sequences of the HCV genome.

The selected HCV bait nucleic acids of the invention are
referred to as the nucleotide sequences SEQ ID N°114 to 150.

The selected HCV bait polypeptides encoded by the nucleic acid
sequences SEQ ID N°114 to 150 consist respectively of the aminoacid
sequences SEQ ID N°77 to 113.

Detectable marker genes are already present within the
chromosomal yeast DNA and consist respectively of the His3 and LacZ
genes, such as described by FROMONT-RACINE et al. (1997) or
FLAJOLET et al. (2000).

Then, the collection of nucleic acid inserts contained in the
collection of E. coli cell clones containing the genomic DNA or HCV DNA
library previously prepared is used to transform a first yeast strain,
namely the Y187 Saccharomyces cerevisiae strain (phenotype: MATα,
Gal4Δ, gal80Δ, ade2-101, His3, Leu2-3, -112, Trp1-901, Ura3-52,
URA3::UASGAL1-LacZ, Met).

The nucleic acid encoding the bait polypeptide of interest is
inserted in the appropriate vector, said vector being used to transform a
second yeast strain which may be the CG1945 strain (MATa, Gal4-542,
Gal80-538, Ade2-101, His3Δ200, Leu2-3, -112, Trp1-901, Ura3-52, Lys2-801,
URA3::GAL4 17Mers (X3)-CyC1TATA-LacZ, LYS2::GAL1 UAS-GAL1 TATA-His3,
CYHR).

Then, the two yeast strains are mated to obtain a collection of
mated cells.

The clones derived from the collection of mated cells above 
which are positive in an X-Gal overlay assay are those for which an 
interaction between the recombinant bait polypeptide and a polypeptide 
encoded by a nucleic acid insert originating from the HCV genomic 
library has occurred. 

The clones derived from the collection of mated cells above
may also be selected in the absence of histidine, and the positive clones
are those for which an interaction between the recombinant bait
polypeptide and a polypeptide encoded by a nucleic acid insert
originating from the HCV genomic library has occurred.

In a further step, the prey nucleic acid inserts contained in the 
positively selected clones are amplified and sequenced. 

Step 2) Determination of the nucleic acid sequences encoding a
Selected Interacting Domain (SID®) polypeptide which binds
specifically to a bait polypeptide of interest





This is the second step of the two-step procedure defined
above, which allows the precise selection of the
SID® nucleic acids of the present invention, which are derived from the
H77 strain HCV genome.

The SID® nucleic acid selection procedure, which is disclosed
hereunder, has been specifically designed for the HCV genome, which
encodes a single polyprotein and which thus comprises contiguous
Open Reading Frames, said polyprotein being further processed to
produce at least 10 mature structural and non-structural viral proteins.

Thus, the second selection step of the two-step procedure
consists of a method for determining a polynucleotide encoding a
Selected Interacting Domain (SID®) of a prey polypeptide of interest
derived from HCV, which prey polypeptide interacts with a bait
polypeptide, wherein said method comprises the steps of:

a) selecting, from the collection of prey polynucleotides
obtained at the end of the first step of the two-step procedure described
herein, all prey polynucleotides encoding a prey polypeptide capable of
interacting with said bait polypeptide and containing a common nucleic
acid fragment;

b) aligning the nucleotide sequences of the prey
polynucleotides selected at step a) and gathering in one set or in a
plurality of sets of sequences those nucleotide sequences which have
sequences that overlap for more than 30% of their respective nucleic
acid length, wherein each common overlapping nucleotide sequence in
one set of sequences defines a sequence encoding a pre-SID®
polypeptide (see Figures 8 and 9); and

c) aligning two sequences encoding two respective pre-SID®
polypeptides (see Figure 10), and:

i) defining an overlapping nucleic acid sequence between the
sequences encoding the two respective pre-SID® polypeptides as a
sequence encoding a SID® polypeptide, provided that the overlapping
sequence is of at least 30 nucleotides in length;

ii) defining a non-overlapping nucleic acid sequence between
the sequences encoding the two respective pre-SID® polypeptides as a
sequence encoding a SID® polypeptide, provided that (1) said
non-overlapping sequence is more than 30 nucleotides in length and (2)
said non-overlapping sequence represents at least 30% in length of any
one of the polynucleotides contained in the set of prey polynucleotides
used for defining the sequence encoding each pre-SID® polypeptide.
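
By way of illustration only, steps b) and c) i) above may be sketched as follows. This is a minimal sketch under stated assumptions and not the procedure actually used: each prey fragment is assumed to be already mapped to the HCV genome as a half-open coordinate interval, the greedy grouping order is an arbitrary choice, and all function names are hypothetical.

```python
from typing import List, Optional, Tuple

# Assumption: a fragment is a half-open interval [start, end) in nucleotide
# coordinates on the reference genome.
Interval = Tuple[int, int]


def overlap_len(a: Interval, b: Interval) -> int:
    """Length of the intersection of two intervals (0 if they are disjoint)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def overlaps_enough(a: Interval, b: Interval, frac: float = 0.30) -> bool:
    """True if the intersection exceeds `frac` of the length of BOTH fragments."""
    inter = overlap_len(a, b)
    return inter > frac * (a[1] - a[0]) and inter > frac * (b[1] - b[0])


def cluster_fragments(frags: List[Interval], frac: float = 0.30) -> List[List[Interval]]:
    """Greedy grouping (an assumption): a fragment joins a set only if it
    overlaps sufficiently with every fragment already in that set."""
    clusters: List[List[Interval]] = []
    for f in sorted(frags):
        for c in clusters:
            if all(overlaps_enough(f, g, frac) for g in c):
                c.append(f)
                break
        else:
            clusters.append([f])
    return clusters


def pre_sid(cluster: List[Interval]) -> Interval:
    """Pre-SID = region common to all fragments of one set."""
    return (max(s for s, _ in cluster), min(e for _, e in cluster))


def sid_from_overlap(p: Interval, q: Interval, min_len: int = 30) -> Optional[Interval]:
    """Step c) i): overlap of two pre-SIDs, kept only if >= min_len nucleotides."""
    if overlap_len(p, q) >= min_len:
        return (max(p[0], q[0]), min(p[1], q[1]))
    return None
```

For instance, the fragments (0, 100), (20, 120) and (40, 140) would fall into one set whose pre-SID® is (40, 100), whereas a distant fragment such as (300, 400) would form a set of its own.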

This method may further comprise the steps of:

d) counting the number of overlapping prey polynucleotides
contained in a first set of polynucleotides defining a sequence encoding
a first SID® polypeptide;

e) counting the number of overlapping prey polynucleotides
contained in a second set of polynucleotides defining a sequence
encoding a second SID® polypeptide which overlaps with the sequence
encoding the first SID® polypeptide;

f) determining which sequence among those encoding
respectively the first SID® polypeptide and the second SID®
polypeptide has been defined with the largest number of prey
polynucleotides and selecting this set of prey sequences;

g) adding to the set of prey sequences selected at step f) those
sequences that were contained in the set of prey sequences used for
defining the sequence encoding the SID® polypeptide with the smallest
number of prey sequences and which overlap with the sequence
encoding the SID® polypeptide with the largest number of prey
sequences;

h) aligning the prey sequences added at step g) with the
sequences already contained in the set of prey sequences which defined
the sequence encoding the SID® polypeptide with the largest number of
prey sequences;

i) defining an overlapping sequence between the whole
sequences which were aligned in step h), wherein said overlapping
sequence consists of a sequence encoding a SID® polypeptide (see
Figure 11).
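
Steps d) to i) may likewise be sketched as follows. This is only an illustrative reading of those steps: fragments and SID® sequences are assumed to be half-open genome coordinate intervals, a tie in fragment counts is resolved in favour of the first set (an assumption, since ties are not addressed above), and the function names are hypothetical.

```python
from typing import List, Optional, Tuple

Interval = Tuple[int, int]  # assumption: half-open [start, end) coordinates


def overlap_len(a: Interval, b: Interval) -> int:
    """Length of the intersection of two intervals (0 if they are disjoint)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def refine_sid(set1: List[Interval], sid1: Interval,
               set2: List[Interval], sid2: Interval) -> Optional[Interval]:
    """Resolve two overlapping SIDs into one refined SID (steps d to i)."""
    # d), e), f): count the fragments of each set and keep the larger set.
    if len(set1) >= len(set2):
        major, major_sid, minor = list(set1), sid1, set2
    else:
        major, major_sid, minor = list(set2), sid2, set1
    # g): add the fragments of the smaller set that overlap the SID
    # defined by the larger set.
    for frag in minor:
        if frag not in major and overlap_len(frag, major_sid) > 0:
            major.append(frag)
    # h), i): the refined SID is the region common to all aligned fragments.
    start = max(s for s, _ in major)
    end = min(e for _, e in major)
    return (start, end) if start < end else None
```

With a first set of four fragments defining (50, 100) and a second set of three fragments defining (90, 140), the first set is retained, the two additional fragments of the second set are merged into it, and the refined region becomes (90, 100).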

The method for selecting a SID® nucleic acid encoding a SID® 
polypeptide is an object of the present invention, as well as any SID® 
nucleic acid or any SID® polypeptide which may be obtained by this 
selection method. 

SID® nucleic acids of the invention 

The SID® nucleic acids selected as described above starting
from the genome of the H77 strain of HCV are the nucleic acid
sequences of SEQ ID N°39 to 76 which encode the SID® polypeptides
of SEQ ID N°1 to 38.

A first object of the invention consists of a nucleic acid which 
encodes a polypeptide selected from the group consisting of the 
aminoacid sequences SEQ ID N°1 to 38 or a variant thereof, and a 
sequence complementary thereto. 

For the purposes of the present invention, a first polynucleotide 
is considered as being « complementary » to a second polynucleotide 
when each base of the first polynucleotide is paired with the 
complementary base of the second polynucleotide whose orientation is 
reversed. The complementary bases are A and T (or A and U), or C and
G.
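
The definition of complementarity given above can be expressed as a short check. The sketch below is illustrative only; it assumes unmodified bases written as upper- or lower-case letters, and the function name is hypothetical.

```python
# Base pairs allowed by the definition: A-T (or A-U) and C-G.
_PAIRS = {("A", "T"), ("T", "A"), ("A", "U"), ("U", "A"), ("C", "G"), ("G", "C")}


def is_complementary(first: str, second: str) -> bool:
    """True if each base of `first` pairs with the complementary base of
    `second` when `second` is read in the reverse orientation."""
    a, b = first.upper(), second.upper()
    if len(a) != len(b):
        return False
    return all((x, y) in _PAIRS for x, y in zip(a, reversed(b)))
```

For example, ATGC and GCAT are complementary under this definition, whereas ATGC is not complementary to itself.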

Preferably, any one of the nucleic acids or polypeptides
encompassed by the invention is in a purified or an isolated form.

The term "isolated" for the purposes of the present invention 
designates a biological material (nucleic acid or protein) which has been 
removed from its original environment (the environment in which it is 
naturally present). 

For example, a polynucleotide present in the natural state in a
plant or an animal is not isolated. The same polynucleotide separated
from the adjacent nucleic acids in which it is naturally inserted in the
genome of the plant or animal is considered as being "isolated".

Such a polynucleotide may be included in a vector and/or such
a polynucleotide may be included in a composition and remains
nevertheless in the isolated state because of the fact that the vector or
the composition does not constitute its natural environment.

The term "purified" does not require the material to be present
in a form exhibiting absolute purity, exclusive of the presence of other
compounds. It is rather a relative definition.

A polynucleotide is in the "purified" state after purification of the
starting material or of the natural material by at least one order of
magnitude, preferably 2 or 3 and more preferably 4 or 5 orders of magnitude.

"Isolated polypeptide" or "isolated protein" is a polypeptide or
protein which is substantially free of those compounds that are normally
associated therewith in its natural state (e.g., other proteins or
polypeptides, nucleic acids, carbohydrates, lipids). "Isolated" is not
meant to exclude artificial or synthetic mixtures with other compounds, or
the presence of impurities which do not interfere with biological activity,
and which may be present, for example, due to incomplete purification,
addition of stabilisers, or compounding into a pharmaceutically
acceptable preparation.

Variants of a selected interacting domain (SID®) polypeptide and
nucleic acids encoding them.

As intended herein, a variant of a Selected Interacting Domain
(SID®) polypeptide may be either a variant polypeptide of the Selected
Interacting Domain (SID®) polypeptide or a polypeptide which is
encoded by a nucleic acid variant of the polynucleotide encoding said
Selected Interacting Domain (SID®) polypeptide.

Polynucleotides which encode a polypeptide variant of a
Selected Interacting Domain (SID®) polypeptide, as the term is used
herein, are polynucleotides that differ from the reference polynucleotide
encoding the parent SID® polypeptide. A variant of a polynucleotide may
be a naturally occurring variant such as a naturally occurring allelic 
variant, or it may be a variant that is not known to occur naturally. Such 
non-naturally occurring variants of the reference polynucleotide may be 
generated by mutagenesis techniques, including those applied to 
polynucleotides, cells or organisms well known to one skilled in the art. 

Generally, differences are limited so that the nucleotide 
sequences of the reference and the variant are closely similar overall 
and, in many regions, identical. 

Variants of polynucleotides according to the invention include, 
without being limited to, nucleotide sequences which are at least 95% 
identical after optimal alignment to the reference polynucleotide of SEQ 
ID N°39 to 76 encoding the reference Selected Interacting Domain 
(SID®) polypeptide, preferably at least 96%, 97%, 98% and most 
preferably at least 99% identical to the reference polynucleotide. 
Similarly, a variant of a SID® polypeptide of the invention consists of a 
polypeptide having at least 95% aminoacid identity with a polypeptide 
selected from the aminoacid sequences SEQ ID N°1 to 38, and 
preferably at least 96%, 97%, 98% and most preferably at least 99% 
aminoacid identity with one of SEQ ID N°1 to 38. 

Identity refers to sequence identity between two peptides or 
between two nucleic acid molecules. Identity between sequences can be 
determined by comparing a position in each of the sequences which may 
be aligned for purposes of comparison. When a position in the compared 
sequences is occupied by the same base or amino acid, then the 
sequences are identical at that position. A degree of identity between 
nucleic acid sequences is a function of the number of identical 
nucleotides at positions shared by these sequences. A degree of identity 
between amino acid sequences is a function of the number of identical 
aminoacids at positions shared by these sequences. Since two 
polynucleotides may each (1) comprise a sequence (i.e., a portion of the 
complete polynucleotide sequence) that is similar between the two 
polynucleotides, and (2) may further comprise a sequence that is
divergent between the two polynucleotides, sequence comparisons
between two (or more) polynucleotides are typically performed by
comparing sequences of the two polynucleotides over a " comparison
window " to identify and compare local regions of sequence similarity. A
" comparison window ", as used herein, refers to a conceptual segment
of at least 20 contiguous nucleotide positions wherein a polynucleotide 
sequence may be compared to a reference sequence of at least 20 
contiguous nucleotides and wherein the portion of the polynucleotide 
sequence in the comparison window may comprise additions or deletions 
(i.e., gaps) of 20 percent or less as compared to the reference sequence 
(which does not comprise additions or deletions) for optimal alignment of 
the two sequences. Optimal alignment of sequences for determining a 
comparison window may be conducted by the local homology algorithm 
of Smith and Waterman (1981), by the homology alignment algorithm of 
Needleman and Wunsch (1972), by the search for similarity method of 
Pearson and Lipman (1988), by computerized implementations of these 
algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin 
Genetics Software Package Release 7.0, Genetics Computer Group,
575 Science Dr., Madison, WI), or by inspection. The best alignment
(i.e., resulting in the highest percentage of identity over the comparison 
window) generated by the various methods is selected. The term 
" sequence identity " means that two polynucleotide sequences are 
identical (i.e., on a nucleotide-by-nucleotide basis) over the window of 
comparison. The term " percentage of sequence identity " is calculated 
by comparing two optimally aligned sequences over the window of 
comparison, determining the number of positions at which the identical 
nucleic acid base (e.g. A, T, C, G, U or I) occurs in both sequences to 
yield the number of matched positions, dividing the number of matched 
positions by the total number of positions in the window of comparison 
(i.e., the window size), and multiplying the result by 100 to yield the
percentage of sequence identity.
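
The calculation of the percentage of sequence identity described above may be sketched as follows. This is a minimal illustration which assumes that the two sequences have already been optimally aligned over the comparison window (the alignment step itself is not shown), that gaps are written as "-", and that a gap never counts as a matched position; the function name is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of positions in the comparison window at which both
    aligned sequences carry the identical base or aminoacid."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("comparison window is empty")
    # Number of matched positions (gap characters are never matches).
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    # Matched positions divided by the window size, multiplied by 100.
    return 100.0 * matches / len(aligned_a)
```

Under these assumptions, two aligned sequences of 10 positions differing at a single position give 90% identity, which would fall below the 95% threshold recited above for variants.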

Most preferably, the percentage of nucleic acid or aminoacid
identity between two nucleic acid or aminoacid sequences is calculated
using the BLAST software (Version 2.06 of September 1998) with the
default parameters.

Nucleotide changes present in a variant polynucleotide may be
silent, which means that they do not alter the aminoacid encoded by the
reference polynucleotide.

However, nucleotide changes may also result in aminoacid
substitutions, additions, deletions, fusions and truncations in the
Selected Interacting Domain (SID®) polypeptide encoded by the
reference sequence.

The substitutions, deletions or additions may involve one or
more nucleotides. Alterations may produce conservative or
non-conservative aminoacid substitutions, deletions or additions.

Most preferably, the variant of a Selected Interacting Domain
(SID®) polypeptide encoded by a variant polynucleotide possesses at
least the same affinity of binding to the protein or polypeptide counterpart
against which it has been initially selected as described above.

The affinity of a given SID® polypeptide of the invention for a
polypeptide to which it specifically binds is defined as the affinity
constant Ka, wherein

$$K_a = \frac{[\text{SID®/polypeptide complex}]}{[\text{free SID®}]\,[\text{free polypeptide}]}$$

where [free SID®], [free polypeptide] and [SID®/polypeptide complex]
are the concentrations at equilibrium respectively of the free SID®
polypeptide, of the free polypeptide to which the SID® polypeptide
specifically binds and of the complex formed between the SID®
polypeptide and the polypeptide to which said SID® polypeptide
specifically binds.

Most preferably, the affinity of a SID® polypeptide of the
invention or a variant thereof for its polypeptide counterpart (polypeptide
partner) is assessed on a Biacore™ apparatus marketed by Amersham
Pharmacia Biotech, such as described by SZABO et al. (1995)
and by Edwards and Leatherbarrow (1997).

As used herein, the expression « at least the same affinity » with
reference to the affinity of binding of a SID® polypeptide of the
invention to another polypeptide means that the Ka is identical to, or is at
least two-fold, preferably at least three-fold and most preferably at least
five-fold greater than, the reference Ka value.
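
As a purely numerical illustration of the affinity constant and of the fold comparison referred to above, the following sketch may be considered. The concentrations used are hypothetical values chosen for the example and are not measurements; the function names are likewise hypothetical, and all concentrations are assumed to be expressed in mol/L at equilibrium.

```python
def affinity_constant(complex_conc: float, free_sid: float, free_partner: float) -> float:
    """Ka = [complex] / ([free SID] x [free polypeptide]), in L/mol."""
    if free_sid <= 0 or free_partner <= 0:
        raise ValueError("free concentrations must be positive")
    return complex_conc / (free_sid * free_partner)


def fold_change(ka_variant: float, ka_reference: float) -> float:
    """Ratio of the Ka of a variant to the Ka of the reference polypeptide."""
    if ka_reference <= 0:
        raise ValueError("reference Ka must be positive")
    return ka_variant / ka_reference


# Hypothetical equilibrium concentrations (mol/L), for illustration only.
ka_reference = affinity_constant(2e-9, 1e-9, 1e-9)
ka_variant = affinity_constant(6e-9, 1e-9, 1e-9)
ratio = fold_change(ka_variant, ka_reference)
```

With these hypothetical figures the variant has a Ka about three times that of the reference, and would therefore meet the « at least three-fold » level mentioned above.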

In another preferred embodiment, the variant of a Selected
Interacting Domain (SID®) polypeptide which is encoded by a variant
polynucleotide of the invention possesses a higher specificity of binding
to its counterpart polypeptide or protein than the reference Selected
Interacting Domain (SID®) polypeptide.

A variant of a Selected Interacting Domain (SID®) polypeptide
according to the invention may be (1) one in which one or more, most
preferably from one to three, of the aminoacid residues are substituted
with a conserved or a non-conserved aminoacid residue and such
substituted aminoacid residue may or may not be one encoded by the
genetic code, or (2) one in which one or more of the aminoacid residues
includes a substituent group.

In the case of an amino acid substitution in the amino acid
sequence of a Selected Interacting Domain (SID®) polypeptide
according to the invention, one or several consecutive or
non-consecutive amino acids are replaced by "equivalent" amino acids. The
expression "equivalent" amino acid is used herein to designate any
amino acid that may be substituted for one of the amino acids belonging
to the native Selected Interacting Domain (SID®) polypeptide structure
without decreasing the binding properties of the corresponding peptides
to their counterpart polypeptide or protein, as compared with the
reference Selected Interacting Domain (SID®) polypeptide.
These equivalent amino acids may be determined either by their
structural homology with the initial amino acids to be replaced, or by the
similarity of their net charge or of their hydrophobicity.
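
A minimal sketch of how candidate equivalent residues might be pre-screened on net charge and hydrophobicity is given below. It uses the Kyte-Doolittle hydropathy index as the hydrophobicity scale; the choice of that scale, the simplified charge table (histidine treated as neutral), the 1.0 threshold and the function name are all assumptions of this sketch. Such a screen only proposes candidates: equivalence in the sense used here still requires that binding to the counterpart polypeptide is not decreased.

```python
# Kyte-Doolittle hydropathy index, one-letter amino acid codes.
HYDROPATHY = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5, "M": 1.9, "A": 1.8,
    "G": -0.4, "T": -0.7, "S": -0.8, "W": -0.9, "Y": -1.3, "P": -1.6,
    "H": -3.2, "E": -3.5, "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9,
    "R": -4.5,
}

# Approximate side-chain charge near neutral pH; residues not listed are 0.
CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}


def is_candidate_equivalent(original, replacement, max_hydropathy_gap=1.0):
    """Return True if two residues share charge and similar hydropathy.

    max_hydropathy_gap is a hypothetical threshold chosen for illustration.
    """
    same_charge = CHARGE.get(original, 0) == CHARGE.get(replacement, 0)
    gap = abs(HYDROPATHY[original] - HYDROPATHY[replacement])
    return same_charge and gap <= max_hydropathy_gap
```

Under these assumptions, leucine to isoleucine or aspartate to glutamate would be proposed as candidates, whereas lysine to glutamate would not.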

By an equivalent amino acid according to the present invention
is also meant the replacement of a residue in the L-form by a residue in
the D-form, or the replacement of a glutamic acid residue by a
pyroglutamic acid compound. The synthesis of peptides containing at least
one residue in the D-form is, for example, described by KOCH (1977). A
specific embodiment of a variant of a Selected Interacting Domain
(SID®) polypeptide according to the invention includes, but is not limited
to, a peptide molecule which is resistant to proteolysis, such as a peptide
in which the -CONH- peptide bond is modified and replaced by a
(-CH2NH-) reduced bond, a (-NHCO-) retroinverso bond, a (-CH2-O-)
methylene-oxy bond, a (-CH2-S-) thiomethylene bond, a (-CH2CH2-)
carba bond, a (-CO-CH2-) hydroxyethylene bond, a (-N-N-) bond or also a
-CH=CH- bond.

As used herein, a variant of a SID® polypeptide of the invention
also encompasses a polypeptide having an amino acid sequence
consisting of at least:

- 45 consecutive amino acids of SEQ ID N°1;
- 30 consecutive amino acids of SEQ ID N°2;
- 65 consecutive amino acids of SEQ ID N°3;
- 30 consecutive amino acids of SEQ ID N°4;
- 130 consecutive amino acids of SEQ ID N°5;
- 25 consecutive amino acids of SEQ ID N°6;
- 23 consecutive amino acids of SEQ ID N°7;
- 48 consecutive amino acids of SEQ ID N°8;
- 36 consecutive amino acids of SEQ ID N°9;
- 25 consecutive amino acids of SEQ ID N°10;
- 24 consecutive amino acids of SEQ ID N°11;
- 37 consecutive amino acids of SEQ ID N°12;
- 25 consecutive amino acids of SEQ ID N°13;
- 30 consecutive amino acids of SEQ ID N°14;
- 27 consecutive amino acids of SEQ ID N°15;
- 69 consecutive amino acids of SEQ ID N°16;
- 130 consecutive amino acids of SEQ ID N°17;
- 33 consecutive amino acids of SEQ ID N°18;
- 25 consecutive amino acids of SEQ ID N°19;
- 40 consecutive amino acids of SEQ ID N°20;
- 78 consecutive amino acids of SEQ ID N°21;
- 39 consecutive amino acids of SEQ ID N°22;
- 57 consecutive amino acids of SEQ ID N°23;
- 26 consecutive amino acids of SEQ ID N°24;
- 68 consecutive amino acids of SEQ ID N°25;
- 34 consecutive amino acids of SEQ ID N°26;
- 42 consecutive amino acids of SEQ ID N°27;
- 48 consecutive amino acids of SEQ ID N°28;
- 102 consecutive amino acids of SEQ ID N°29;
- 49 consecutive amino acids of SEQ ID N°30;
- 92 consecutive amino acids of SEQ ID N°31;
- 71 consecutive amino acids of SEQ ID N°32;
- 55 consecutive amino acids of SEQ ID N°33;
- 69 consecutive amino acids of SEQ ID N°34;
- 23 consecutive amino acids of SEQ ID N°35;
- 33 consecutive amino acids of SEQ ID N°36;
- 32 consecutive amino acids of SEQ ID N°37;
- 22 consecutive amino acids of SEQ ID N°38.



Without wishing to be bound by any particular theory, the
inventors believe that polypeptides having an amino acid length about
10% shorter than the amino acid length of any one of the SID®
polypeptides of SEQ ID N°1 to 39 of the invention have a high probability
of retaining the binding properties of the parent SID® polypeptide
towards a given (bait) polypeptide.

The invention also pertains to a nucleic acid encoding a SID®
polypeptide which is selected from the group consisting of the sequences
SEQ ID N°39 to 76, and a sequence complementary thereto.

The invention is also directed to a nucleic acid encoding a
variant of a SID® polypeptide selected from the group consisting of the
sequences SEQ ID N°39 to 76, in reference to the definition of the SID®
polypeptide variants above.

For example, a nucleic acid encoding a polypeptide having an
amino acid sequence consisting of at least 45 consecutive amino acids of
SEQ ID N°1 comprises at least 135 (45 x 3) consecutive nucleotides of
the polynucleotide of SEQ ID N°39.

The same definition also applies to nucleic acids encoding
variants of the SID® polypeptides of SEQ ID N°2 to 38, which are part of
the invention.
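
The 45 x 3 = 135 arithmetic generalises to any of the minimum fragment lengths, since each amino acid is encoded by one codon of three nucleotides. The sketch below applies it to the first three minimum lengths quoted in the list of variants (45, 30 and 65 amino acids for SEQ ID N°1, N°2 and N°3); the function and variable names are hypothetical.

```python
CODON_LENGTH = 3  # nucleotides per encoded amino acid


def min_coding_nucleotides(n_amino_acids):
    """Minimum number of consecutive nucleotides encoding n amino acids."""
    if n_amino_acids < 0:
        raise ValueError("number of amino acids cannot be negative")
    return n_amino_acids * CODON_LENGTH


# Minimum fragment lengths (in amino acids) quoted for three SID polypeptides.
MIN_FRAGMENT_AA = {"SEQ ID N°1": 45, "SEQ ID N°2": 30, "SEQ ID N°3": 65}

# Corresponding minimum lengths of the encoding nucleic acid fragments.
MIN_FRAGMENT_NT = {
    name: min_coding_nucleotides(n_aa) for name, n_aa in MIN_FRAGMENT_AA.items()
}
```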

The invention further relates to a nucleic acid encoding a
polypeptide having an amino acid sequence comprising from 1 to 3
substitutions, additions or deletions of one amino acid as compared with a
polypeptide selected from the group consisting of the amino acid
sequences SEQ ID N°1 to 38, or a sequence complementary thereto.

Another object of the invention consists of a polypeptide
selected from the group consisting of the amino acid sequences SEQ ID
N°39 to 76 or a variant thereof.

The family of variants of a SID® polypeptide of the invention
encompasses those polypeptides having an amino acid
sequence comprising from 1 to 3 substitutions, additions or deletions of
one amino acid as compared with a polypeptide selected from the group
consisting of the amino acid sequences SEQ ID N°1 to 38.
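
One way to test whether a candidate sequence differs from a reference by 1 to 3 single-residue substitutions, additions or deletions is to compute the edit (Levenshtein) distance between the two sequences. This is offered only as a sketch of one possible reading: the minimal number of edits is taken as the count of changes, and the function names and the toy sequences in the usage note are hypothetical, not sequences of the invention.

```python
def edit_distance(reference, candidate):
    """Minimal number of single-residue substitutions, insertions or
    deletions turning `reference` into `candidate` (Levenshtein distance)."""
    previous = list(range(len(candidate) + 1))
    for i, ref_residue in enumerate(reference, start=1):
        current = [i]
        for j, cand_residue in enumerate(candidate, start=1):
            current.append(min(
                previous[j] + 1,                                # deletion
                current[j - 1] + 1,                             # insertion
                previous[j - 1] + (ref_residue != cand_residue) # substitution
            ))
        previous = current
    return previous[-1]


def is_close_variant(reference, candidate, max_changes=3):
    """True if the candidate differs by 1 to max_changes single-residue edits."""
    return 1 <= edit_distance(reference, candidate) <= max_changes
```

For example, with the toy reference "MSTNPKPQRK", the candidate "MSTNAKPQRK" (one substitution) qualifies, whereas an identical sequence or one carrying four substitutions does not.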

The invention is also directed to an antibody directed against
a SID® polypeptide as defined above, or against a variant thereof.

The antibodies directed specifically against the Selected
Interacting Domain (SID®) polypeptide or a variant thereof may be
either radioactively or non-radioactively labelled.

Monoclonal antibodies directed against a SID® polypeptide
may be prepared from hybridomas according to the technique described
by Kohler and Milstein in 1975. Polyclonal antibodies may be prepared
by immunization of a mammal, especially a mouse or a rabbit, with the
SID® polypeptide combined with an immunological adjuvant, and
then by purifying the specific antibodies contained in the serum of the
immunized animal on an affinity chromatography column on which the
polypeptide that has been used as the antigen has previously been
immobilized.

Antibodies directed against a SID® polypeptide may also be
produced by the trioma technique and by the human B-cell hybridoma
technique (Kozbor et al., 1983).

Antibodies directed to a SID® polypeptide include chimeric
single chain Fv antibody fragments (US Patent N° 4,946,778;
Martineau et al., 1998), antibody fragments obtained through phage
display libraries (Ridder et al., 1995) and humanized antibodies
(Reinmann et al., 1997; Leger et al., 1997). Also, transgenic mice, or
other organisms such as other mammals, may be used to express
antibodies, including for example, humanized antibodies directed against
a SID® polypeptide of the invention, or a variant thereof.

VECTORS OF THE INVENTION

The nucleic acids coding for a Selected Interacting Domain
(SID®) polypeptide or a variant thereof, which are defined in the section
above, can be inserted into an appropriate expression vector, i.e., a
vector which contains the necessary elements for the transcription and
translation of the inserted protein-coding sequence. Such transcription
elements include a regulatory region and a promoter as defined
previously. Thus, the nucleic acid encoding a marker compound of the
invention is operably linked with a promoter in an expression vector,
wherein said expression vector may include a replication origin.

The necessary transcriptional and translational signals are most
preferably provided by the recombinant expression vector.

Structure of the vectors encompassed by the invention 

A wide variety of host/expression vector combinations may be
employed in expressing the nucleic acids of this invention. Useful
expression vectors, for example, may consist of segments of
chromosomal, non-chromosomal and synthetic DNA sequences.
Suitable vectors include derivatives of SV40 and known bacterial
plasmids, e.g., Escherichia coli plasmids ColE1, pCR1, pBR322,
pMal-C2, pET, pGEX (Smith et al., 1988), pMB9 and their derivatives;
plasmids such as RP4; phage DNAs, e.g., the numerous derivatives of
phage λ, e.g., NM989, and other phage DNA, e.g., M13 and filamentous
single stranded phage DNA; yeast plasmids such as the 2µ plasmid or
derivatives thereof; vectors useful in eukaryotic cells, such as vectors
useful in insect or mammalian cells; vectors derived from combinations
of plasmids and phage DNAs, such as plasmids that have been modified
to employ phage DNA or other expression control sequences; and the
like.

For example, in a baculovirus expression system, both
non-fusion transfer vectors, such as but not limited to pVL941 (BamHI
cloning site; Summers), pVL1393 (BamHI, SmaI, XbaI, EcoRI, NotI,
XmaIII, BglII, and PstI cloning site; Invitrogen), pVL1392 (BglII, PstI, NotI,
XmaIII, EcoRI, XbaI, SmaI, and BamHI cloning site; Summers and
Invitrogen), and pBlueBacIII (BamHI, BglII, PstI, NcoI, and HindIII
cloning site, with blue/white recombinant screening possible; Invitrogen),
and fusion transfer vectors, such as but not limited to pAc700 (BamHI
and KpnI cloning site, in which the BamHI recognition site begins with
the initiation codon; Summers), pAc701 and pAc702 (same as pAc700,
with different reading frames), pAc360 (BamHI cloning site 36 base
pairs downstream of a polyhedrin initiation codon; Invitrogen (195)), and
pBlueBacHisA, B, C (three different reading frames, with BamHI, BglII,
PstI, NcoI, and HindIII cloning site, an N-terminal peptide for ProBond
purification, and blue/white recombinant screening of plaques; Invitrogen
(220)), can be used.

Mammalian expression vectors contemplated for use in the
invention include vectors with inducible promoters, such as the
dihydrofolate reductase (DHFR) promoter, e.g., any expression vector
with a DHFR expression vector, or a DHFR/methotrexate
co-amplification vector, such as pED (PstI, SalI, SbaI, SmaI, and EcoRI
cloning site, with the vector expressing both the cloned gene and DHFR;
Kaufman, 1991). Alternatively, a glutamine synthetase/methionine
sulfoximine co-amplification vector can be used, such as pEE14 (HindIII,
XbaI, SmaI, SbaI, EcoRI, and BclI cloning site, in which the vector
expresses glutamine synthase and the cloned gene; Celltech). In another
embodiment, a vector that directs episomal expression under control of
Epstein Barr Virus (EBV) can be used, such as pREP4 (BamHI, SfiI,
XhoI, NotI, NheI, HindIII, NheI, PvuII, and KpnI cloning site, constitutive
RSV-LTR promoter, hygromycin selectable marker; Invitrogen), pCEP4
(BamHI, SfiI, XhoI, NotI, NheI, HindIII, NheI, PvuII, and KpnI cloning
site, constitutive hCMV immediate early gene, hygromycin selectable
marker; Invitrogen), pMEP4 (KpnI, PvuI, NheI, HindIII, NotI, XhoI, SfiI,
BamHI cloning site, inducible metallothionein IIa gene promoter,
hygromycin selectable marker; Invitrogen), pREP8 (BamHI, XhoI, NotI,
HindIII, NheI, and KpnI cloning site, RSV-LTR promoter, histidinol
selectable marker; Invitrogen), pREP9 (KpnI, NheI, HindIII, NotI, XhoI,
SfiI, and BamHI cloning site, RSV-LTR promoter, G418 selectable
marker; Invitrogen), and pEBVHis (RSV-LTR promoter, hygromycin
selectable marker, N-terminal peptide purifiable via ProBond resin and
cleaved by enterokinase; Invitrogen). Selectable mammalian expression
vectors for use in the invention include pRc/CMV (HindIII, BstXI, NotI,
SbaI, and ApaI cloning site, G418 selection; Invitrogen), pRc/RSV
(HindIII, SpeI, BstXI, NotI, XbaI cloning site, G418 selection; Invitrogen),
and others. Vaccinia virus mammalian expression vectors (see
Kaufman, 1991, supra) for use according to the invention include but are
not limited to pSC11 (SmaI cloning site, TK- and β-gal selection),
pMJ601 (SalI, SmaI, AflI, NarI, BspMII, BamHI, ApaI, NheI, SacII, KpnI
and HindIII cloning site; TK- and β-gal selection), and pTKgptF1S
(EcoRI, PstI, SalI, AccI, HindII, SbaI, BamHI, and Hpa cloning site, TK or
XPRT selection).

Yeast expression systems can also be used according to the
invention to express a Selected Interacting Domain (SID®) polypeptide
or a variant thereof and also a marker compound as defined herein. For
example, the non-fusion pYES2 vector (XbaI, SphI, ShoI, NotI, GstXI,
EcoRI, BstXI, BamHI, SacI, KpnI, and HindIII cloning site; Invitrogen) or
the fusion pYESHisA, B, C (XbaI, SphI, ShoI, NotI, BstXI, EcoRI,
BamHI, SacI, KpnI, and HindIII cloning site, N-terminal peptide purified
with ProBond resin and cleaved with enterokinase; Invitrogen), to
mention just two, can be employed according to the invention.

Once a suitable host system and growth conditions are
established, recombinant expression vectors can be propagated and
prepared in quantity. As previously explained, the expression vectors
which can be used include, but are not limited to, the following vectors or
their derivatives: human or animal viruses such as vaccinia virus or
adenovirus; insect viruses such as baculovirus; yeast vectors;
bacteriophage vectors (e.g., lambda); and plasmid and cosmid DNA
vectors, to name but a few.

Vectors are introduced into the desired host cells by methods
known in the art, e.g., transfection, electroporation, microinjection,
transduction, cell fusion, DEAE dextran, calcium phosphate precipitation,
lipofection (lysosome fusion), use of a gene gun, or a DNA vector
transporter (see, e.g., Wu et al., 1992; Wu and Wu, 1988; Canadian
Patent Application No. 2,012,311, filed March 15, 1990).

A cell has been "transfected" by exogenous or heterologous
DNA when such DNA has been introduced inside the cell. A cell has
been "transformed" by exogenous or heterologous DNA when the
transfected DNA effects a phenotypic change.

For introducing a vector into a cell host, explicit reference is made
to research carried out by the group of E. Wagner, relating to gene
delivery by means of plasmid-polylysine complexes (Curiel et al., 1991;
and Curiel et al., 1992). The plasmid-polylysine complex, upon
exposure to certain cell lines, showed at least some expression of
the gene. Further, it was found that the expression efficiency increased
considerably due to the binding of transferrin to the plasmid-polylysine
complex. Transferrin gives rise to close cell-complex contact with cells
comprising transferrin receptors; it binds the entire complex to the
transferrin receptor of cells. Subsequently, at least part of the entire
complex was found to be incorporated in the cells investigated.

Several different approaches have been developed for gene
transfer. These include the use of viral based vectors (e.g., retroviruses,
adenoviruses, and adeno-associated viruses) (Drumm, M. L. et al.;
Rosenfeld, M. A. et al., 1992; and Muzyczka, 1992), charge associating
the DNA with an asialoorosomucoid/poly-L-lysine complex (Wilson, J. M.
et al., 1992), charge associating the DNA with cationic liposomes
(Brigham, K. L. et al., 1993) and the use of cationic liposomes in
association with a poly-L-lysine antibody complex (Trubetskoy, V. S. et
al., 1993).

Compositions comprising vectors of the invention.

Although non-viral based transfection systems have not
exhibited the efficiency of viral vectors, they have received significant
attention, in both in vitro and in vivo research, because of their
theoretical safety when compared to viral vectors. Synthetic cationic
molecules have been reported which "coat" the nucleic acid
through the interaction of the cationic sites on the transfection agent and
the anionic sites on the nucleic acid. The positively charged coating
reportedly interacts with the negatively charged cell membrane to
facilitate the passage of the nucleic acid through the cell membrane by
non-specific endocytosis (Schofield, 1995). These compounds have,
however, exhibited considerable sensitivity to natural serum inhibition,
which has probably limited their efficiency in vivo as gene transfection
agents (Behr, 1994).

A number of attempts have been made to improve the
efficiency of lipid-like cationic transfection agents, some involving the use
of polycationic molecules. For example, several transfection agents have
been developed that contain the polycationic compound spermine
covalently attached to a lipid carrier. Behr (1994) discloses a
lipopolyamine and shows it to be more efficient at transfecting cells than
single charge molecules (albeit still less efficient than viral vectors). The
agent reported by Behr was, however, toxic, and caused cell death.

A few such lipid delivery systems for transporting DNA, proteins,
and other chemical materials across membrane boundaries have been
synthesized by research groups and business entities. Most of the
synthesis schemes are relatively complex and generate lipid based
delivery systems having only limited transfection abilities. A need exists
in the field of gene therapy for cationic lipid species that have a high
biopolymer transport efficiency. It has been known for some time that a
very limited number of certain quaternary ammonium derivatized
(cationic) liposomes spontaneously associate with DNA, fuse with cell
membranes, and deliver the DNA into the cytoplasm (as noted above,
these species have been termed "cytofectins"). LIPOFECTIN™
represents a first generation of cationic liposome formulation
development. LIPOFECTIN™ is composed of a 1:1 formulation of the
quaternary ammonium containing compound DOTMA and
dioleoylphosphatidylethanolamine sonicated into small unilamellar
vesicles in water. Problems associated with LIPOFECTIN™ include
non-metabolizable ether bonds, inhibition of protein kinase C activity,
and direct cytotoxicity. In response to these problems, a number of other
related compounds have been developed. The monoammonium
compounds of the subject invention improve upon the capabilities of
existing cationic liposomes and serve as a very efficient delivery system
for biologically active chemicals.

Most preferred vectors of the invention.

Most preferred recombinant vectors according to the invention
include pASAA (figure 2), pACTIIst (figure 3), pT18 (figure 4), pUT18C
(figure 5), pT25 (figure 6), pKT25 (figure 7), pB5 (figure 12) and pP6
(figure 13), each containing inserted therein a nucleic acid encoding a
Selected Interacting Domain (SID®) polypeptide or a variant thereof as
defined above.

The present invention is also directed to a vector usable in a
two-hybrid method which consists of the vector pP6 which is shown in
figure 13. As disclosed in example 1, the vector pP6 has been
successfully used for preparing a collection of recombinant plasmids
consisting of a genomic DNA library from the pathogenic strain H77 of
the hepatitis C virus.

The invention also pertains to a vector usable in a two-hybrid
method which consists of the vector pB5. As disclosed in example 1, the
vector pB5 has been successfully used in a yeast two-hybrid method as
a bait plasmid.

RECOMBINANT CELL HOSTS 

In one embodiment, a Selected Interacting Domain (SID®) 
polypeptide of the invention or a variant thereof is recombinantly 
produced in a desired host cell which has been transfected or 
transformed with a nucleic acid encoding said Selected Interacting 
Domain (SID®) polypeptide or with a recombinant vector as defined 
above within which a nucleic acid encoding a Selected Interacting 
Domain (SID®) polypeptide of the invention is inserted. 

Recombinant cell hosts are another aspect of the present 
invention. 

Such cell hosts generally comprise at least one copy of a
nucleic acid encoding a Selected Interacting Domain (SID®) polypeptide
of the invention or a variant thereof.

Preferred cells for expression purposes will be selected
according to the objective which is sought. For example, in the
embodiment wherein the production of a Selected Interacting Domain
(SID®) polypeptide according to the invention in large quantities is
sought, the nature of the host cell used for its production is relatively
unimportant, provided that large amounts of Selected Interacting Domain
(SID®) polypeptides of the invention are produced and that optional
further purification steps may be carried out easily.

However, in the embodiment wherein the Selected Interacting
Domain (SID®) polypeptide is recombinantly produced within a host
organism for the purpose of interfering with a specific protein-protein
interaction, then the host organism is selected among the host
organisms which are suspected to produce naturally said polypeptide of
interest.

Consequently, mammalian and typically human cells, as well as 
bacterial, yeast, fungal, insect, nematode and plant cells are cell hosts 
encompassed by the invention and which may be transfected either by a 
nucleic acid or a recombinant vector as defined above. 

Examples of suitable recombinant host cells include VERO
cells, HELA cells (e.g. ATCC N°CCL2), CHO cell lines (e.g. ATCC
N°CCL61), COS cells (e.g. COS-7 cells; COS cell referred to ATCC
N°CRL1650), WI38, BHK, HepG2, 3T3 (e.g. ATCC N°CRL6361), A549,
PC12, K562 cells, 293 cells, Sf9 cells (e.g. ATCC N°CRL1711) and Cv1
cells (e.g. ATCC N°CCL70).

Other suitable host cells usable according to the invention
include prokaryotic host cells, such as strains of Escherichia coli (e.g.
strain DH5-α), of Bacillus subtilis, of Salmonella typhimurium, or strains
of genera such as Pseudomonas, Streptomyces and Staphylococcus.

Further suitable host cells usable according to the invention
include yeast cells such as those of Saccharomyces, typically
Saccharomyces cerevisiae.

The invention also relates to a method for producing a SID® 
polypeptide as defined above, wherein said method comprises the steps 
of: 

a) cultivating a cell host which has been transformed with a 
SID® nucleic acid of the invention or with a vector containing a SID® 
nucleic acid in an appropriate culture medium; 

b) recovering the SID® recombinant polypeptide from the 
culture supernatant or from the cell lysate. 

The SID® polypeptides or variant thereof thus recombinantly 
obtained may be purified, for example by high performance liquid 
chromatography, such as reverse phase and/or cationic exchange 
HPLC, as described by ROUGEOT et al. (1994). The reason to prefer 
this kind of peptide or protein purification is the lack of by-products found 
in the elution samples which renders the resultant purified protein more 
suitable for a therapeutic use. 






TWO-HYBRID METHODS OF THE INVENTION

a) Yeast two-hybrid methods

The invention also pertains to a yeast two-hybrid method for
selecting a recombinant cell clone containing a vector comprising a
nucleic acid insert encoding a prey polypeptide which binds with a SID®
polypeptide of SEQ ID N°1 to 38 or a variant thereof, wherein said
method comprises the steps of:

a) mating at least one first recombinant yeast cell clone of a
collection of recombinant yeast cell clones transformed with a plasmid
containing the prey polynucleotide to be assayed with a second haploid
recombinant Saccharomyces cerevisiae cell clone transformed with a
plasmid containing a bait polynucleotide encoding a SID® polypeptide of
the invention or a variant thereof;

b) cultivating diploid cells obtained in step a) on a selective
medium; and

c) selecting recombinant cell clones which grow on said
selective medium.

The yeast two-hybrid method above may further comprise the
step of:

d) characterizing the prey polynucleotide contained in each
recombinant cell clone selected in step c).

Most preferably, such a yeast two-hybrid method may be
performed by one skilled in the art as disclosed in example 2
hereafter.

According to the yeast two-hybrid method above, a SID®
polypeptide of the invention or a variant thereof is used as a bait
polypeptide.

In a preferred embodiment of the yeast two-hybrid method
described above, the prey polynucleotide is a DNA fragment from the
genome of a pathogenic strain of the hepatitis C virus (HCV) ranging
from about 150 to about 600 nucleotides in length and which is inserted
in a vector which is contained in one recombinant clone of a collection of
recombinant cell clones.
b) Bacterial two-hybrid method

A bacterial two-hybrid method of the invention may be
performed by one skilled in the art according to the teachings of
KARIMOVA et al. (1998).

The first step of selecting a collection of nucleic acids encoding
polypeptides which bind specifically to the bait polypeptide may also be
carried out through a bacterial two-hybrid system.

According to such a bacterial two-hybrid system, bacterial cell
clones, preferably Escherichia coli cells, are transformed with a plasmid
containing a bait polynucleotide encoding a bait polypeptide.

Then, plasmids containing a DNA insert are provided by
rescuing the plasmids obtained from the collection of yeast clones
containing the genomic DNA or cDNA library which are described in the
previous section entitled "Yeast two-hybrid system". For example, the
plasmid rescue may be carried out according to the following steps:

(i) extracting plasmid DNA contained in the collection of yeast
clones obtained as disclosed in the previous section, by using a
conventional DNA extraction buffer and a phenol:chloroform:isoamyl
alcohol mixture (25:24:1) before centrifuging;

(ii) transferring a desired volume of the supernatant obtained at
the end of step (i) to a sterile Eppendorf tube and adding a precipitation
buffer (ethanol/NH4Ac) before centrifuging and resuspending the pellet
after washing in ethanol;

(iii) transforming, by electroporation, Escherichia coli cells (e.g.
Escherichia coli cells of strain MC1066) which have been rendered
electrocompetent with a desired volume (e.g. 1 µl) of the yeast plasmid
DNA extract obtained at step (ii);

(iv) collecting the transformed Escherichia coli cells.

Alternatively, a collection of Escherichia coli cell clones
containing a collection of HCV genomic DNA inserts may be obtained by
constructing the DNA library directly in the bacterial cell, such as
disclosed in Flajolet et al. (2000).

Then, the bacterial recombinant cells which have been
transformed both with a plasmid containing a bait polynucleotide
encoding a bait polypeptide and with a plasmid containing a prey
polynucleotide encoding a prey polypeptide are cultivated on a selective
medium.

Then, recombinant cell clones capable of growing on said
selective medium are selected and the DNA inserts of the plasmids
contained therein are sequenced.

A bacterial two-hybrid system is generally understood to mean a
method that usually makes use of at least one reporter gene, the
transcription of which is triggered when the specific domain contained
in a prey polypeptide and the complementary domain contained in the
bait polypeptide, both produced by the recombinant cell, bind to each
other.

The invention further pertains to a bacterial two-hybrid method
for identifying a recombinant cell clone containing a prey polynucleotide
encoding a prey polypeptide which binds with a SID® polypeptide of
SEQ ID N°1 to 38 or a variant thereof, wherein said method comprises
the steps of:

a) transforming bacterial cell clones with a plasmid containing a
SID® polynucleotide encoding a SID® polypeptide of the invention or a
variant thereof;

b) rescuing prey plasmids containing prey polynucleotides,
wherein each prey polynucleotide is a DNA fragment from the genome of
a desired organism and wherein each prey plasmid is contained in one
recombinant yeast cell clone of a collection of recombinant yeast cell
clones;

c) transforming the recombinant bacterial cell clones obtained in
step a) with the plasmids rescued in step b);

d) cultivating bacterial recombinant cells obtained in step c) on
a selective medium; and

e) selecting recombinant cell clones which grow on said
selective medium.

The bacterial two-hybrid method described above may further
comprise the step of f) characterizing the prey polynucleotide contained
in each recombinant cell clone selected at step e).

In one preferred embodiment of the yeast or bacterial
two-hybrid methods described above, the prey polypeptide is a human
polypeptide expressed by a mammal which is infected by the hepatitis C
virus, like humans and monkeys, typically chimpanzees.

Generally, the yeast two-hybrid method or the bacterial
two-hybrid method as disclosed herein may be performed with prey
polypeptides of any origin, whether viral, fungal, bacterial or mammalian,
i.e. of either prokaryotic or eukaryotic origin.

In a second preferred embodiment of the two-hybrid methods
above, the prey polypeptide is an HCV polypeptide.

Most preferably, the prey polypeptide is encoded by a strain of
the hepatitis C virus which is pathogenic for humans, such as strain H77.

SETS OF NUCLEIC ACIDS AND SETS OF POLYPEPTIDES OF THE 
INVENTION 

In yet another aspect, the present invention relates to a set of
two nucleic acids consisting of:

i) a first nucleic acid encoding a SID® polypeptide of SEQ ID
N°1 to 39 of the invention or a variant thereof; and

ii) a second nucleic acid encoding a prey polypeptide which
binds specifically with a SID® polypeptide defined in i).

In still a further aspect, the invention is also directed to a set of
two polypeptides consisting of:

i) a first polypeptide consisting of a SID® polypeptide of SEQ ID
N°1 to 39 of the invention or a variant thereof; and

ii) a second polypeptide which binds specifically with the first
polypeptide.

The invention further relates to a complex formed between:

i) a first polypeptide consisting of a SID® polypeptide of SEQ ID
N°1 to N°38 of the invention; and

ii) a second polypeptide which binds specifically with the first
polypeptide.

The invention also relates to a protein-protein interaction 
wherein the two interacting proteins consist of a set of two polypeptides 
as defined above. 

In a preferred embodiment, the invention relates to the protein- 
protein interactions wherein the sets of two polypeptides consist of a 
SID® polypeptide of SEQ ID N°1 to 38 and an HCV polypeptide. 

When several reiterations of the two-hybrid method are
performed and common SID® polypeptides and prey polypeptides are thus
selected, a map of all the interactions between these polypeptides
may be designed, which takes into account the known and/or suspected
biological function of each of the interacting polypeptides.

Table 1 illustrates protein-protein interactions between the SID®
polypeptides of SEQ ID N°1 to 38 and polypeptides of SEQ ID N°77 to
113 which are encoded by the genome of strain H77 of the hepatitis C
virus, which is pathogenic for a mammal, such as human or chimpanzee.

Thus, the data presented in table 1 disclose particular sets of 
nucleic acids as well as particular sets of polypeptides which are 
encompassed by the present invention. 

For example, table 1 discloses that the nucleic acid of SEQ ID
N°39 encodes the SID® polypeptide of SEQ ID N°1, which contains
exclusively (100 %) an amino acid sequence from the Core protein of
HCV strain H77.

The nucleic acid of SEQ ID N°39 starts at the nucleotide in
position 446 and ends at the nucleotide in position 600 of the HCV
genome which is described by YANAGI et al. (1997).

Table 1 also discloses that the SID® polypeptide of SEQ ID N°1
is part of a set of polypeptides of the invention, wherein the second
polypeptide of said set of polypeptides consists of the polypeptide of
SEQ ID N°77 which is encoded by the nucleic acid sequence of SEQ ID
N°114, 87% of which is derived from the region of the H77 strain HCV
DNA encoding the Core protein.

Thus, a particular set of polypeptides according to the invention
consists of:

i) the polypeptide of SEQ ID N°1; and

ii) the polypeptide of SEQ ID N°77.

The same reasoning applies to every set of polypeptides
disclosed in table 1, each of which is expressly part of the present
invention.

Similarly, a particular set of nucleic acids according to the
invention consists of:

(i) the nucleic acid of SEQ ID N°39; and

(ii) the nucleic acid of SEQ ID N°114.

The same reasoning applies to every set of nucleic acids
disclosed in table 1, each of which is expressly part of the present
invention.

Thus, particular sets of two polypeptides of the invention are
respectively SEQ ID N°77/SEQ ID N°1; SEQ ID N°78/SEQ ID N°2;
SEQ ID N°78/SEQ ID N°3; SEQ ID N°79/SEQ ID N°4; SEQ ID N°80/SEQ ID N°5;
SEQ ID N°81/SEQ ID N°6; SEQ ID N°82/SEQ ID N°7; SEQ ID N°83/SEQ ID N°8;
SEQ ID N°84/SEQ ID N°9; SEQ ID N°85/SEQ ID N°10; SEQ ID N°86/SEQ ID N°11;
SEQ ID N°87/SEQ ID N°12; SEQ ID N°88/SEQ ID N°13; SEQ ID N°89/SEQ ID N°14;
SEQ ID N°90/SEQ ID N°15; SEQ ID N°91/SEQ ID N°16; SEQ ID N°92/SEQ ID N°17;
SEQ ID N°93/SEQ ID N°18; SEQ ID N°94/SEQ ID N°19; SEQ ID N°95/SEQ ID N°20;
SEQ ID N°96/SEQ ID N°21; SEQ ID N°97/SEQ ID N°22; SEQ ID N°98/SEQ ID N°23;
SEQ ID N°99/SEQ ID N°24; SEQ ID N°100/SEQ ID N°25; SEQ ID N°101/SEQ ID N°26;
SEQ ID N°102/SEQ ID N°27; SEQ ID N°103/SEQ ID N°28; SEQ ID N°104/SEQ ID N°29;
SEQ ID N°105/SEQ ID N°30; SEQ ID N°106/SEQ ID N°31; SEQ ID N°107/SEQ ID N°32;
SEQ ID N°108/SEQ ID N°33; SEQ ID N°109/SEQ ID N°34; SEQ ID N°110/SEQ ID N°35;
SEQ ID N°111/SEQ ID N°36; SEQ ID N°112/SEQ ID N°37; and
SEQ ID N°113/SEQ ID N°38.

Similarly, particular sets of two nucleic acids according to the
invention are respectively: SEQ ID N°114/SEQ ID N°39;
SEQ ID N°115/SEQ ID N°40; SEQ ID N°115/SEQ ID N°41; SEQ ID N°116/SEQ ID N°42;
SEQ ID N°117/SEQ ID N°43; SEQ ID N°118/SEQ ID N°44; SEQ ID N°119/SEQ ID N°45;
SEQ ID N°120/SEQ ID N°46; SEQ ID N°121/SEQ ID N°47; SEQ ID N°122/SEQ ID N°48;
SEQ ID N°123/SEQ ID N°49; SEQ ID N°124/SEQ ID N°50; SEQ ID N°125/SEQ ID N°51;
SEQ ID N°126/SEQ ID N°52; SEQ ID N°127/SEQ ID N°53; SEQ ID N°128/SEQ ID N°54;
SEQ ID N°129/SEQ ID N°55; SEQ ID N°130/SEQ ID N°56; SEQ ID N°131/SEQ ID N°57;
SEQ ID N°132/SEQ ID N°58; SEQ ID N°133/SEQ ID N°59; SEQ ID N°134/SEQ ID N°60;
SEQ ID N°135/SEQ ID N°61; SEQ ID N°136/SEQ ID N°62; SEQ ID N°137/SEQ ID N°63;
SEQ ID N°138/SEQ ID N°64; SEQ ID N°139/SEQ ID N°65; SEQ ID N°140/SEQ ID N°66;
SEQ ID N°141/SEQ ID N°67; SEQ ID N°142/SEQ ID N°68; SEQ ID N°143/SEQ ID N°69;
SEQ ID N°144/SEQ ID N°70; SEQ ID N°145/SEQ ID N°71; SEQ ID N°146/SEQ ID N°72;
SEQ ID N°147/SEQ ID N°73; SEQ ID N°148/SEQ ID N°74; SEQ ID N°149/SEQ ID N°75;
and SEQ ID N°150/SEQ ID N°76.
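
By way of illustration only, the pairings enumerated above can be
written compactly. The following Python sketch is not part of table 1;
the function names are hypothetical, and the numbering offsets are
merely read off the two lists above (an observation about the listed
SEQ ID numbers, not a rule stated in the specification).

```python
def polypeptide_pairs():
    """Return the listed (partner, SID) polypeptide SEQ ID number pairs."""
    # The first three pairs are listed explicitly; SEQ ID N°78 appears twice.
    pairs = [(77, 1), (78, 2), (78, 3)]
    # For SID N°4 to N°38 the listed partner number is the SID number plus 75.
    pairs += [(sid + 75, sid) for sid in range(4, 39)]
    return pairs


def nucleic_acid_pairs(pairs):
    """Map polypeptide pairs to the listed nucleic acid SEQ ID number pairs."""
    # In the lists above, partner numbers shift by 37 and SID numbers by 38.
    return [(partner + 37, sid + 38) for partner, sid in pairs]


if __name__ == "__main__":
    for partner, sid in nucleic_acid_pairs(polypeptide_pairs()):
        print(f"SEQ ID N°{partner}/SEQ ID N°{sid}")
```

Generating the second list from the first in this way is only a
convenience for checking that the two enumerations stay in step.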

The protein-protein interactions disclosed in table 1 allow the
design of a map of interactions between various polypeptides encoded
by the genome of the H77 strain of HCV.

In such a Protein Interaction Map (PIM®), each SID®
polypeptide is linked to the bait polypeptide onto which it specifically
binds, for example by an arrow.

Such a Protein Interaction Map (PIM®) may help the one skilled
in the art to decipher a whole metabolic and/or physiological pathway
that is functionally active within a pathogenic strain of HCV. The
Protein Interaction Map and computable versions of the PIM® are part of
the present invention.
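
As a purely illustrative sketch, such a map can be held in memory as a
directed graph in which each arrow runs from a SID® polypeptide to the
polypeptide it binds. The three interactions used below are taken from
the sets listed above (SEQ ID N°1 with N°77, and SEQ ID N°2 and N°3 with
N°78); the function names are hypothetical and no particular software
is implied by the specification.

```python
from collections import defaultdict


def build_interaction_map(interactions):
    """Build a directed map: SID SEQ ID number -> list of partner SEQ ID numbers."""
    pim = defaultdict(list)
    for sid, partner in interactions:
        pim[sid].append(partner)  # one arrow from the SID to its partner
    return dict(pim)


def sids_binding(pim, partner):
    """Return the SID SEQ ID numbers whose arrows point at the given partner."""
    return sorted(sid for sid, partners in pim.items() if partner in partners)


# Three of the interactions listed in the text, as (SID, partner) pairs.
example_map = build_interaction_map([(1, 77), (2, 78), (3, 78)])
```

Querying the map by partner, as `sids_binding` does, is one simple way
to see which SID® polypeptides converge on the same polypeptide.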

Therefore, in still another aspect, the present invention is
directed to a computer readable medium (such as a floppy disk, a
CD-ROM and any electronic or magnetic format which can be read by a
computer) having stored thereon protein-protein interactions according to
the invention, preferably stored in the form of a Protein Interaction Map, as
shown, for example, in FROMONT-RACINE et al. (1997).

In a preferred embodiment, the invention comprises a
computer readable medium as defined above, wherein the
protein-protein interactions stored thereon are linked to an annotated
database, for example through the Internet.

In another preferred embodiment, the invention comprises a 
data bank containing the protein-protein interactions stored thereon, said 
data bank being available on a world-wide web site. 
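
By way of a hedged example only, protein-protein interactions could be
stored on such a medium as a small JSON document. The field names
`sid_seq_id` and `partner_seq_id` and both function names are
assumptions made for this sketch; the specification does not prescribe
any file format.

```python
import json


def interactions_to_json(pairs):
    """Serialise (SID, partner) SEQ ID number pairs to a JSON string."""
    records = [{"sid_seq_id": sid, "partner_seq_id": partner}
               for sid, partner in pairs]
    return json.dumps({"interactions": records}, indent=2)


def interactions_from_json(text):
    """Read the pairs back from a JSON string produced by interactions_to_json."""
    data = json.loads(text)
    return [(rec["sid_seq_id"], rec["partner_seq_id"])
            for rec in data["interactions"]]
```

A plain-text format of this kind can be written to a file, linked to
annotation records by SEQ ID number, or served from a web site.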


METHODS FOR SELECTING INHIBITORS OF PROTEIN-PROTEIN 
INTERACTIONS OF THE INVENTION 

The transformed host cells as described above can also be
used as models so as to study the interactions between a SID®
polypeptide of the invention and its binding partner polypeptide, or
between a SID® polypeptide of the invention and chemical or protein
compounds which inhibit the binding between said SID® polypeptide and
its binding partner polypeptide.

Examples of a SID® polypeptide and its binding partner
polypeptide are typically the sets of polypeptides of the invention which
are described above.

In particular, the transformed host cells of the invention may be
used for the selection of molecules which interact with a SID®
polypeptide as described herein, as cofactor or as inhibitor, in particular
a competitive inhibitor, or alternatively having an agonist or antagonist
activity on the protein-protein interaction wherein said SID® polypeptide
is involved. Preferably, the said transformed host cells will be used as a
model allowing, in particular, the selection of products which make it
possible to prevent and/or to treat pathologies induced by the hepatitis C
virus.

Consequently, the invention also consists of a method for
selecting a molecule which inhibits the protein-protein interaction of a set
of two polypeptides as defined above, wherein said method comprises
the steps of:

a) cultivating a recombinant host cell containing a reporter gene
the expression of which is toxic for said recombinant host cell, said host
cell being transformed with two vectors wherein:

i) the first vector contains a nucleic acid comprising a
polynucleotide encoding a first hybrid polypeptide containing one of said
two polypeptides and a DNA binding domain;

ii) the second vector contains a nucleic acid comprising a
polynucleotide encoding a second hybrid polypeptide containing the
second of said two polypeptides and an activating domain capable of
activating said toxic reporter gene when the first and the second hybrid
polypeptides are interacting;

on a selective medium containing the molecule to be tested and allowing
the growth of said recombinant host cell when the toxic reporter gene is
not activated; and

b) selecting the molecule which inhibits the growth of the
recombinant host cell defined in step a).

The invention is also directed to a method for selecting a
molecule which inhibits the protein-protein interaction of a set of two
polypeptides as defined above, wherein said method comprises the
steps of:

a) cultivating a recombinant host cell containing a reporter gene
the expression of which is toxic for said recombinant host cell, said host
cell being transformed with two vectors wherein:

i) the first vector contains a nucleic acid comprising a
polynucleotide encoding a first hybrid polypeptide containing one of said
two polypeptides and the first domain of an enzyme;

ii) the second vector contains a nucleic acid comprising a
polynucleotide encoding a second hybrid polypeptide containing the
second of said two polypeptides and the second part of said enzyme
capable of activating said toxic reporter gene when the first and the
second hybrid polypeptides are interacting, said interaction recovering
the catalytic activity of the enzyme;

on a selective medium containing the molecule to be tested and allowing
the growth of said recombinant host cell when the toxic gene is not
activated; and

b) selecting the molecule which inhibits the growth of the
recombinant host cell defined in step a).

In a preferred embodiment, said toxic reporter gene that can be
used for negative selection is the URA3, CYH1 or CYH2 gene.

For example, a method for the screening of a molecule which
inhibits the interaction between a SID® polypeptide of the invention and
its binding protein counterpart may comprise the following steps:

- transform a permeabilized yeast cell with two vectors,
respectively a first vector containing a SID® nucleic acid of the invention
and a second vector containing a prey nucleic acid as defined in the
present specification;

- plate on top agar the transformed permeabilized yeast cells
above on square boxes;

- apply by spotting the candidate inhibitor molecules to test on
top agar as soon as it is solidified;

- incubate, for example, overnight at 30°C; and

- select the inhibitor compounds that allow the growth of the
transformed yeast cells.
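
The selection logic of the steps above can be sketched as follows, by
way of illustration only. The sketch assumes an idealised toxic
reporter (the cell fails to grow whenever the two hybrid polypeptides
interact) and ignores candidate molecules that are themselves toxic to
yeast; the function names and compound names are hypothetical.

```python
def reporter_is_active(hybrids_interact):
    """The toxic reporter is expressed only when the two hybrids interact."""
    return hybrids_interact


def cell_grows(molecule_blocks_interaction):
    """A cell grows on the selective medium only if the toxic reporter is off."""
    hybrids_interact = not molecule_blocks_interaction
    return not reporter_is_active(hybrids_interact)


def select_inhibitors(candidates):
    """Return the names of candidates whose spots allow yeast growth.

    `candidates` maps a compound name to True if that compound blocks the
    interaction between the SID polypeptide and its binding partner.
    """
    return sorted(name for name, blocks in candidates.items()
                  if cell_grows(blocks))
```

In this idealised model, growth around a spot is read as evidence that
the spotted molecule disrupted the interaction.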

The invention also provides for a kit for the screening of a
molecule which inhibits the protein-protein interaction of a set of two
polypeptides as defined above, wherein said kit comprises a
recombinant host cell containing a reporter gene the expression of which
is toxic for said recombinant host cell, said host cell being transformed
with two vectors wherein:

i) the first vector contains a nucleic acid comprising a
polynucleotide encoding a first hybrid polypeptide containing one of said
two polypeptides and a DNA binding domain;

ii) the second vector contains a nucleic acid comprising a
polynucleotide encoding a second hybrid polypeptide containing the
second of said two polypeptides and an activating domain capable of
activating said toxic reporter gene when the first and the second hybrid
polypeptides are interacting.

Another object of the invention consists of a kit for the
screening of a molecule which inhibits the protein-protein interaction of a
set of two polypeptides as defined above, wherein said kit comprises a
recombinant host cell containing a reporter gene the expression of which
is toxic for said recombinant host cell, said host cell being transformed
with two plasmids wherein:

i) the first vector contains a nucleic acid comprising a
polynucleotide encoding a first hybrid polypeptide containing one of said
two polypeptides and the first domain of a protein;

ii) the second vector contains a nucleic acid comprising a
polynucleotide encoding a second hybrid polypeptide containing the
second of said two polypeptides and the second part of said protein
capable of activating said toxic reporter gene when the first and the
second hybrid polypeptides are interacting, said interaction recovering
the activity of the protein.

In the selection methods above, the transcription or activating
domain and the DNA-binding domain may be derived from Gal4 and LexA
respectively.

In the embodiment wherein the first domain is a first part of an
enzyme and a complementary domain is a second part of the same
enzyme, and wherein the proximity of the two parts of the enzyme
restores the enzyme activity and activates a reporter gene, the two parts
of the enzyme are most preferably the T25 and T18 polypeptides that
form the catalytic domain of the Bordetella pertussis adenylate cyclase.

As an illustrative embodiment, the reporter gene is chosen
from the group consisting of a nutritional gene and a gene the
expression of which is visualised by colorimetry, such as His3, LacZ or
both LacZ and His3.


MARKER COMPOUNDS OF THE INVENTION 

The Selected Interacting Domain (SID®) polypeptides of SEQ
ID N°1 to 38 of the invention and variants thereof defined in the present
specification, and which bind specifically to a polypeptide of interest (e.g.
a bait polypeptide), are useful as reagents for specifically detecting,
labelling, targeting or purifying a polypeptide of interest, typically a
polypeptide encoded by HCV, within a sample, since the SID®
polypeptides possess properties that have never been achieved with
conventional detection compounds, such as an antibody or an
antibody fragment.

Firstly, the SID® polypeptides of the invention possess a high 
specificity of binding to the polypeptide of interest, since a SID® 
polypeptide consists of a portion of a larger polypeptide which binds in a 
highly specific manner to the polypeptide of interest in the natural 
environment within the eukaryotic cell infected by the Hepatitis C virus. 

Secondly, the SID® polypeptides generally have a low molecular
weight, generally from 3 kDa, and are thus easy to produce, on the one
hand, and, on the other hand, can be easily introduced within a cell when
the detection of the localisation or of the expression of the polypeptide of
interest is sought. Moreover, the small size of a SID® polypeptide allows
its passage through inner cell barriers such as the nuclear membrane, or
the membranes surrounding the different cell organelles.

Thus, a first object of the invention consists of a marker 
compound wherein said compound comprises : 

a) a Selected Interacting Domain (SID®) polypeptide of the 
invention or a variant thereof that binds specifically to the polypeptide of 
interest; and 

b) a detectable molecule bound thereto. 

Such a marker compound is primarily useful for detecting, 
labelling or targeting a polypeptide of interest, for example a polypeptide 
of interest contained in a sample. 

A detectable molecule according to the invention comprises, or
alternatively consists of, any molecule which produces or can be induced
to produce a signal. The detectable molecule can be a member of the
signal producing system that includes the signal producing means.

The detectable molecule may be isotopic or non-isotopic. By
way of example and not limitation, the detectable molecule can be part of
a catalytic reaction system such as enzymes, enzyme fragments,
enzyme substrates, enzyme inhibitors, co-enzymes, or catalysts; part of
a chromogen system such as fluorophores, dyes, chemiluminescers,
luminescers, or sensitizers; or a dispersible particle that can be
non-magnetic or magnetic, a solid support, a liposome, a ligand, a
receptor, a hapten, a radioactive isotope, and so forth.

It must be generally understood that all of the embodiments
disclosed in the present specification involving a Selected Interacting
Domain (SID®) polypeptide apply equally to any variant
thereof.

Fluorescent detectable molecules 

In one aspect of the marker compound according to the
invention, the detectable molecule consists of a fluorescent molecule.
Fluorescent moieties which are frequently used as labels are for example
those described by Ichinose et al. (1991). Another fluorescent detectable
molecule is fluorescein isothiocyanate (FITC), such as described by
Shattil et al. (1987) or by Goding et al. (1986). The fluorescent
detectable molecule may also comprise a phycoerythrin as taught by
Goding et al. (1986), and Shattil et al. (1985). Other examples of
fluorescent detectable molecules suitable for use as labels of a marker
compound according to the invention are rhodamine isothiocyanate,
dansyl chloride and XRITC.

Another fluorescent detectable molecule consists of the green
fluorescent protein (GFP) of the jellyfish Aequorea victoria, and its
numerous fluorescent protein derivatives.

The one skilled in the art may advantageously refer to the
articles of CHALFIE et al. (1994) and of HEIM et al. (1994), which
disclose the uses of GFP for the study of gene expression and protein
localisation. The one skilled in the art may also refer to the article of
Rizzuto et al. (1995), which discusses the use of wild-type GFP as a tool
for visualising subcellular organelles in cells, to the article of KAETHER
and GERDES (1995), which reports the visualisation of protein transport
along the secretory pathway using wild-type GFP, to the article of HU and
CHENG (1995), which relates to the expression of GFP in plant cells, and
also to the article of Davis et al. (1995), which discloses GFP
expression in Drosophila embryos. For the use of several fluorescent
variants of GFP, the one skilled in the art may refer to the article of
Delagrave et al. (1995), as well as to the article of Heim et al. (1995).
DNA encoding GFP is available commercially, for example from
Clontech in Palo Alto, California, USA. The one skilled in the art may
also use humanized GFP genes such as those described in the US Patent
N°6,020,192 and also the GFP protein disclosed in the US Patent
N°5,941,084.

Another fluorescent protein that may be used in a marker
compound according to the invention consists of the yellow fluorescent
protein (YFP).

A further suitable luminescent protein consists of the luciferase
protein.

Detectable molecules exhibiting a catalytic activity 

In another embodiment of a detectable molecule included in a
marker compound according to the invention, said detectable molecule is
endowed with a catalytic activity and may thus consist of enzymes and
catalytically active enzyme fragments. Some enzymatic labels are
described in US Patent N°3,654,090. Such enzymes may be for example
horseradish peroxidase (HRP), alkaline phosphatase or glutathione
peroxidase, which are well known to the one skilled in the art.

Enzymes, enzyme fragments, enzyme inhibitors, enzyme
substrates, and other components of enzyme reaction systems can be
used as detectable molecules. Where any of these components is used
as a detectable molecule, a chemical reaction involving one of the
components is part of the signal producing system.

Coupled catalysts can also involve an enzyme with a
non-enzymatic catalyst. The enzyme can produce a reactant, which
undergoes a reaction catalysed by the non-enzymatic catalyst, or the
non-enzymatic catalyst may produce a substrate (including co-enzymes)
for the enzyme. The one skilled in the art may advantageously refer to
the US Patent N°4,160,645, which discloses a wide variety of
non-enzymatic catalysts which may be employed, the appropriate portions
of which are incorporated herein by reference.

The enzyme or co-enzyme employed provides the desired
amplification by producing a product which absorbs light, e.g., a dye, or
emits light upon irradiation, e.g., a fluorescer. Alternatively, the catalytic
reaction can lead to direct light emission, e.g., chemiluminescence. A
large number of enzymes and co-enzymes for providing such products
are described in the US Patents N°4,275,149, columns 19 to 23 and
N°4,318,980, columns 10 to 14, which disclosures are incorporated
herein by reference.

A number of enzyme combinations are set forth in US Patent 
N°4,275,149, columns 23 to 28 which disclosures are incorporated 
herein by reference. 

When a single enzyme is used as the detectable molecule, or
alternatively as comprised in the detectable molecule, enzymes that
may find use are hydrolases, transferases, lyases, isomerases, ligases
or synthetases and oxidoreductases.

Alternatively, luciferases may be used such as firefly luciferase
and bacterial luciferase.

Primarily, the enzymes of choice, based on the I.U.B.
classification, are: (i) class 1, oxidoreductases, and (ii) class 3,
hydrolases. Most preferred oxidoreductases are (i) dehydrogenases of
class 1.1, more particularly 1.1.1, 1.1.3 and 1.1.99, and (ii) peroxidases
in class 1.11. Of the hydrolases, class 3.1, more particularly 3.1.3,
and class 3.2, more particularly 3.2.1, are preferred.

Illustrative dehydrogenases include malate dehydrogenase,
glucose-6-phosphate dehydrogenase and lactate dehydrogenase. Of the
oxidases, glucose oxidase is exemplary. Of the peroxidases,
horseradish peroxidase is illustrative. Of the hydrolases, alkaline
phosphatases, β-glucosidase and lysozyme are illustrative.

Chemiluminescent detectable molecules 

The detectable molecule comprised within the marker
compound according to the invention may also consist of a
chemiluminescent moiety. The chemiluminescent source involves a
compound which becomes electronically excited by a chemical reaction
and may emit light which serves as the detectable signal or donates
energy to a fluorescent acceptor.

A diverse number of families of compounds have been found to
provide chemiluminescence under a variety of conditions. One family of
compounds is 2,3-dihydro-1,4-phthalazinedione. The most utilised
compound is luminol, which is the 5-amino analogue of the compound
above. Other members of the family include the 5-amino-6,7,8-
trimethoxy- and the dimethylamine-[ca]benzo analogues. These
compounds can be made to luminesce with alkaline hydrogen peroxide
or calcium hypochlorite and base.

Another family of compounds is the 2,4,5-triphenylimidazoles,
with lophine as the common name for the parent product.
Chemiluminescent analogues include para-dimethylamino- and
para-methoxy-substituents. Chemiluminescence may also be obtained with
acridinium esters, dioxetanes and oxalates, usually oxalyl active esters,
e.g., p-nitrophenyl, and a peroxide, e.g., hydrogen peroxide, under basic
conditions. Alternatively, luciferins may be used in conjunction with
luciferase or lucigenins.

Radioactive detectable molecules 

In a further embodiment of a detectable molecule comprised in
a marker compound according to the invention, said detectable molecule
is radioactively labelled, such as with [3H], [32P] and [125I].

Colloidal metal detectable molecules 


In still a further embodiment, the detectable molecule
comprised in a marker compound according to the invention may include
a colloidal metal particle. Colloidal metals have been employed in
immunoassays previously. Mostly, they consisted of either colloidal iron
or gold. The one skilled in the art may advantageously refer to the
articles of Horisberger (1981) and Martin et al. (1990). In other cases, the
metals are chosen for their colour, i.e., their presence is determined by
their colour or electron density under an electron microscope. Both the
colour and electron density are directly proportional to the mass of the
metal colloid.

STRUCTURE OF THE MARKER COMPOUNDS OF THE INVENTION

In a first preferred embodiment of a marker compound of the
invention, the detectable molecule is covalently bound to the Selected
Interacting Domain (SID®) polypeptide of SEQ ID N°1 to SEQ ID N°38
or a variant thereof.

According to this specific embodiment, detectable molecules
comprising fluorescent proteins such as GFP and YFP, enzymes or
enzyme fragments such as alkaline phosphatase, glutathione
peroxidase and horseradish peroxidase, chemiluminescent molecules,
radioactive labels or colloidal metal particles will be preferred.

General methods that may be used by the one skilled in the art
for covalently binding the detectable molecules to the Selected
Interacting Domain (SID®) polypeptide are described in the numerous
bibliographic references related to the preparation of the antibody
conjugates used for carrying out immunoassays.

In a second preferred embodiment of a marker compound
according to the invention, the detectable molecule is non-covalently
bound to the Selected Interacting Domain (SID®) polypeptide or a
variant thereof.

In a first preferred aspect of this second preferred embodiment,
the detectable molecule consists of an antibody directed specifically
against the Selected Interacting Domain (SID®) polypeptide or a variant
thereof.

The antibodies directed specifically against the Selected
Interacting Domain (SID®) polypeptide or a variant thereof may be
either radioactively or non-radioactively labelled.

NUCLEIC ACIDS ENCODING A MARKER COMPOUND OF THE
INVENTION

The present invention also relates to a nucleic acid encoding a
marker compound as defined above.

Most preferred nucleic acids encompassed by the invention
include polynucleotides that encode a marker compound wherein the
Selected Interacting Domain (SID®) polypeptide of SEQ ID N°1 to 38 or
a variant thereof is covalently bound to the detectable molecule and
wherein the detectable molecule itself consists of a polypeptide.

Most preferred nucleic acids are those of SEQ ID N°39 to 76. 

In a first preferred embodiment of a nucleic acid according to
the invention, said nucleic acid encodes a Selected Interacting
Domain (SID®) polypeptide which is fused to a fluorescent protein, such
as GFP and YFP.

In a second preferred embodiment of a nucleic acid according
to the invention, said nucleic acid encodes a Selected Interacting
Domain (SID®) polypeptide which is fused to a polypeptide endowed
with a catalytic activity, such as an enzyme or an enzymatically active
enzyme fragment, like alkaline phosphatase, glutathione peroxidase and
horseradish peroxidase.

In a preferred embodiment, a nucleic acid encoding a marker
compound of the invention comprises a DNA coding sequence which is
transcribed and translated into said marker compound in a cell in vitro or
in vivo when placed under the control of appropriate regulatory
sequences. The boundaries of the coding sequence are determined by a
start codon and a translation stop codon. A coding sequence can
include, but is not limited to:

- prokaryotic sequences, for example when the Selected
Interacting Domain (SID®) nucleic acid and the nucleic acid fused
thereto which encodes the detectable molecule are of prokaryotic origin;

- prokaryotic and eukaryotic sequences, for example when the nucleic
acid encoding the detectable molecule originates from a eukaryotic host
organism.

If the coding sequence is intended for expression in a
eukaryotic cell, a polyadenylation signal and transcription termination
sequence will usually be located 3' to the coding sequence.
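
As a minimal sketch of how the boundaries of a coding sequence may be
located from a start codon and a translation stop codon, the following
Python function scans a DNA string in frame from its first ATG. It
assumes the standard genetic code (stop codons TAA, TAG and TGA) and
considers only the first ATG on the given strand; the function name and
the example sequence are hypothetical and are not taken from the
sequence listing.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code


def find_coding_sequence(dna):
    """Return the sequence from the first ATG to the first in-frame stop codon.

    Returns None if there is no ATG or no in-frame stop codon after it.
    """
    dna = dna.upper()
    start = dna.find("ATG")
    if start == -1:
        return None
    # Walk codon by codon, staying in the reading frame set by the ATG.
    for i in range(start, len(dna) - 2, 3):
        if dna[i:i + 3] in STOP_CODONS:
            return dna[start:i + 3]  # include the stop codon itself
    return None
```

For a fusion between a SID® nucleic acid and a nucleic acid encoding a
detectable polypeptide, the same in-frame walk is a quick check that no
stop codon interrupts the junction.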

In a most preferred embodiment of a nucleic acid sequence
according to the invention, said nucleic acid sequence includes a
regulatory region which is functional in the host organism within which
the expression of said nucleic acid sequence is sought, wherein said
regulatory region comprises a promoter sequence.

"Regulatory region" means a nucleic acid sequence which 
regulates the expression of a nucleic acid. A regulatory region may 
include sequences which are naturally responsible for expressing a 
particular nucleic acid (a homologous region) or may include sequences
of a different origin (responsible for expressing different proteins or even 
synthetic proteins). In particular, the sequences can be sequences of 
eukaryotic or viral genes or derived sequences which stimulate or 
repress transcription of a gene in a specific or non-specific manner and 
in an inducible or non-inducible manner. Regulatory regions include 
origins of replication, RNA splice sites, enhancers, transcriptional 
termination sequences, signal sequences which direct the polypeptide 
into the secretory pathways of the target cell, and promoters.

A "promoter sequence" is a DNA regulatory region capable of 
binding RNA polymerase in a cell and initiating transcription of a 
downstream (3' direction) coding sequence. For purposes of defining the
present invention, the promoter sequence is bounded at its 3' terminus 
by the transcription initiation site and extends upstream (5' direction) to 
include the minimum number of bases or elements necessary to initiate 
transcription at levels detectable above background. Within the promoter 
sequence will be found a transcription initiation site (conveniently defined 
for example, by mapping with nuclease S1), as well as protein binding 
domains (consensus sequences) responsible for the binding of RNA 
polymerase. 

A coding sequence is "under the control" of transcriptional and 
translational control sequences in a cell when RNA polymerase 
transcribes the coding sequence into mRNA, which is then trans-RNA 
spliced and translated into the protein encoded by the coding sequence. 

Most preferred vectors for the expression of a marker compound of 
the invention. 




Most preferred recombinant vectors for expressing a marker
compound of the invention include pASAA (figure 2), pACTIIst (figure 3),
pT18 (figure 4), pUT18C (figure 5), pT25 (figure 6), pKT25 (figure 7),
pB5 (figure 12) and pP6 (figure 13) containing inserted therein a nucleic
acid encoding a Selected Interacting Domain (SID®) polypeptide as
defined above or a variant thereof.

The invention also pertains to recombinant host cells
transformed with a vector expressing a marker compound as defined
above, more particularly a vector comprising inserted therein a nucleic
acid encoding said marker compound, which is operably linked to
suitable regulation signals which are functional in the host cell wherein its
expression is sought.

Preferred cells for expression purposes will be selected
according to the objective which is sought. For example, in the
embodiment wherein the production of a marker compound according to
the invention in large quantities is sought, the nature of the host cell used
for its production is relatively unimportant, provided that large amounts of
Selected Interacting Domain (SID®) polypeptides or marker compounds
of the invention are produced and that optional further purification steps
may be carried out easily.

However, in the embodiment wherein the marker compound is
recombinantly produced within a host organism for the purpose of
qualitative or quantitative analysis of the polypeptide of interest onto
which said marker compound specifically binds, then the host organism
is selected among the host organisms which are suspected to produce
naturally said polypeptide of interest.

Consequently, mammalian and human cells, as well as
bacterial, yeast, fungal, insect, nematode and plant cells are cell hosts
encompassed by the invention, and may be transfected either by a
nucleic acid or a recombinant vector as defined above.

DETECTION METHODS OF THE INVENTION 

The present invention further relates to the use of a Selected
Interacting Domain (SID®) polypeptide of SEQ ID N°1 to 38 or a variant
thereof, as well as of a nucleic acid encoding it, such as the nucleic
acids of SEQ ID N°39 to 76, for detection purposes. It is recalled that a
Selected Interacting Domain (SID®) polypeptide is defined by its
ability to bind in a highly specific manner to
a given (e.g. bait) polypeptide of interest, since the amino acid sequence
of a SID® polypeptide is encoded by a nucleic acid, the nucleotide
sequence of which consists of the polynucleotide sequence which is
common to a collection of nucleic acid sequences encoding prey
polypeptides that have been selected for their specific binding properties
to a (bait) polypeptide of interest, as explained above in the section
entitled "SELECTED INTERACTING DOMAIN (SID®)
POLYPEPTIDES".
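
As a purely illustrative sketch of the notion of a sequence "common to a collection" of prey sequences, the common region may be viewed as the intersection of the prey fragments once each fragment has been positioned on the coding sequence from which it derives. The coordinates, the function name `common_region` and the representation of a fragment as an inclusive (start, end) pair are assumptions made for this example only; they are not taken from the selection procedure described in this specification.

```python
from typing import List, Optional, Tuple

# A prey fragment is represented by its inclusive (start, end) nucleotide
# positions on the parent coding sequence (hypothetical representation).
Fragment = Tuple[int, int]


def common_region(fragments: List[Fragment]) -> Optional[Fragment]:
    """Return the interval shared by all fragments, or None if there is none."""
    if not fragments:
        return None
    start = max(s for s, _ in fragments)  # right-most fragment start
    end = min(e for _, e in fragments)    # left-most fragment end
    return (start, end) if start <= end else None


# Hypothetical prey fragments selected against the same bait polypeptide.
prey_fragments = [(120, 480), (150, 510), (90, 450)]
shared = common_region(prey_fragments)  # the region present in every prey
```

With the hypothetical fragments above, the shared interval runs from position 150 to position 450; fragments that do not all overlap yield no common region.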

The specific properties of a Selected Interacting Domain (SID®) 
polypeptide for binding to a given polypeptide of interest, either a viral, 
yeast, fungal, bacterial, insect, plant or mammal polypeptide, including a 
polypeptide of human origin, allow its use as a specific ligand for said 
polypeptide of interest of which the detection is sought. 

Therefore, the use of a Selected Interacting Domain (SID®)
polypeptide in any detection method known in the art which makes use
of the ability of a detection ligand to bind specifically to a molecule of
interest, most preferably a polypeptide of interest, falls within the scope
of the present invention.

Detection methods that make use of the recognition of a
molecule of interest, most preferably a polypeptide of interest, by a
detection ligand are well known in the art and are primarily illustrated by
the abundant literature that relates to immunoassays, which is
incorporated herein by reference in its entirety.

The person skilled in the art may particularly refer to the book of
Maggio (1980) (Heterogeneous assays), US Patent N°3,817,837
(Homogeneous immunoassays), US Patent N°3,993,345
(Immunofluorescence methods), US Patent N°4,233,402 (Enzyme
channelling techniques), US Patent N°3,817,837 (Enzyme multiplied
immunoassay technique), US Patent N°4,366,241 and European Patent
Application N°EP-A 0 143 574 (Migration type assays), US Patent
N°5,202,006, US Patent N°5,120,413 and US Patent N°5,145,567
(Immunofixation electrophoresis, immunoelectrophoresis), the article of
Aguzzi et al. (1977), the article of White et al. (1986), the article of Merlini
et al. (1983), US Patent N°5,228,960 (Immunosubtraction
electrophoresis), the articles of Chen et al. (1991) and Nielsen et al. (1991)
and US Patent N°5,120,413 (Capillary electrophoresis).

Acellular detection method of the invention. 

A first detection method of the invention consists of a method 
for detecting a polypeptide of interest within a sample, wherein said 
method comprises the steps of: 

a) contacting a marker compound or a plurality of marker
compounds according to the invention with the sample which is
suspected to contain the polypeptide of interest the detection of which is
sought;

b) detecting the complexes formed between said marker
compound or said plurality of marker compounds and said polypeptide of
interest.

The sample which is assayed for the presence of the
polypeptide of interest the detection of which is sought may be of any
nature, including every sample that may be used for carrying out an
immunoassay.

In a first aspect, the sample may be any biological fluid, such as
blood or blood separation products (e.g. serum, plasma, buffy coat),
urine, saliva or tears.

In a second aspect, the sample may be any isolated biological
tissue sample, including tissue sections previously fixed for purposes of
histological studies.

In a third aspect, the sample may be a culture supernatant of a
cell culture or a cell lysate of cultured cells.

In a first preferred embodiment of the first detection method of
the invention described above, the detection step b) consists of
measuring the fluorescence signal intrinsically emitted by the detectable
molecule. For example, advantage may be taken of SID®
polypeptides or variants thereof having in their amino acid sequence one
or several tryptophan residues.

In a second preferred embodiment of the first detection method 
of the invention detailed above, the detection step b) consists of 
submitting the detectable molecule to a source of energy at the 
excitation wavelength of said detectable molecule, and measuring the 
light emitted at the emission wavelength of said detectable molecule. 

An illustrative example of this second embodiment above is 
when the marker compound used consists of a Selected Interacting 
Domain (SID®) which is bound to a fluorescent molecule, such as the 
fluorescent proteins GFP or YFP. 

For example, in the embodiment wherein the detectable
molecule of the marker compound of the invention which is used
according to the first detection method above comprises, or alternatively
consists of, a GFP protein, the detection step b) includes illuminating the
sample tested at an excitation wavelength substantially equal to 490 nm,
and measuring the light emitted by the marker compound which is bound
to the polypeptide of interest within the sample at an emission
wavelength substantially equal to 510 nm.

Preferably, the marker compounds which are not bound to the 
polypeptide of interest the detection of which is sought within the sample 
are removed before carrying out the detection step. 

In a third preferred embodiment, the detection step b) of the first
detection method of the invention consists of measuring the catalytic
activity of the detectable molecule. In this specific embodiment, the
marker compound used in the detection method comprises a detectable
molecule which comprises, or alternatively which consists of, an enzyme
or a catalytically active enzyme fragment, as already detailed in the
section entitled "Marker compounds of the invention".

In a fourth preferred embodiment, the detection step b) consists 
of measuring the radioactivity emitted by the detectable molecule. 

The present invention further relates to a kit for detecting a
polypeptide of interest within a sample, wherein said kit comprises a
marker compound according to the invention.

Optionally, said detection kit further comprises the reagents
necessary for carrying out the detection step b), such as a suitable
substrate for the particular enzyme or catalytically active enzyme
fragment used, as well as suitable buffer solutions, which may be
identical to those conventionally used for performing immunoassays.

Cellular detection assay using a recombinantly produced marker
compound of the invention.

As already described above, any marker compound according
to the invention may be produced according to genetic engineering
techniques. Particularly, a nucleic acid encoding a particular marker
compound which binds specifically to a polypeptide of interest the
detection of which is sought may be inserted in a vector, wherein said
vector may be used to transfect or transform a host organism, either a
prokaryotic or a eukaryotic cell host as defined above.

In this specific embodiment, the production of a recombinant
marker compound of the invention is allowed within such a transfected or
transformed host cell. Once the host cell of interest is transfected or
transformed with such a recombinant vector and once the recombinant
marker compound is produced within the cell host of interest, then the
Selected Interacting Domain (SID®) polypeptide portion of said marker
compound will be able to bind specifically to its specific target
polypeptide within the cell host. In this situation, the recombinantly
produced marker compound of the invention will predominantly be
localised at cell sites wherein the targeted polypeptide of interest is
present.

This is the purpose of the second detection method of the 
invention which is detailed below. 

A further object of the invention consists of a method for
detecting a polypeptide of interest within a prokaryotic or a eukaryotic
cell host, wherein said method comprises the steps of:

a) providing a cell host to be assayed;

b) transfecting said cell host with a nucleic acid encoding a
marker compound of the invention, or with a recombinant vector
encoding a marker compound of the invention;

c) detecting the complexes formed between the marker
compound expressed by the transfected cell host and the polypeptide of
interest.

Because the Selected Interacting Domain (SID®) polypeptide
which is part of a marker compound of the invention specifically binds to
a polypeptide which is suspected to be naturally produced by the
targeted cell host, the second detection method of the invention defined
above allows a qualitative as well as a quantitative detection of this
targeted polypeptide which is suspected to be naturally produced by the
transfected target cell host under assay.

For example, in the embodiment in which the procedure for
selecting the Selected Interacting Domain (SID®) polypeptide which is
part of a marker compound of the invention includes a first step wherein
a collection of clones containing nucleic acid inserts derived from a H77
strain HCV genomic DNA library is prepared, the transfection of a
mammalian cell, preferably a human cell, with a vector encoding such a
marker compound of the invention will allow the detection of the
expression of a human polypeptide which is naturally expressed within
said mammalian host cell and which naturally interacts with the HCV
viral protein from which the Selected Interacting Domain (SID®)
polypeptide is derived.

The second detection method of the invention defined above
firstly allows the qualitative detection of the targeted polypeptide of
interest which binds specifically with the recombinantly produced marker
compound of the invention, and thus makes it possible to determine
under which environmental conditions or at which differentiation stage
the targeted polypeptide of interest is naturally produced within the cell
host transfected with a vector expressing a marker compound of the
invention.

Secondly, this second detection method of the invention allows
the localisation of the targeted polypeptide of interest within the interior
of the cell, including localisation in the plasma membrane, cytosol,
nucleus and any organelle such as ribosomes, Golgi apparatus,
lysosomes, phagosomes, endoplasmic reticulum and chloroplasts.

The localisation of a targeted polypeptide of interest which is
expressed within the cell host under assay according to the second
detection method of the invention may be carried out by any means well
known in the art, including using a confocal microscope.

Thirdly, the second detection method of the invention also
allows a quantitative analysis of the expression of the targeted
polypeptide of interest within the cell host under assay, since the level of
the detection signal produced by the detectable molecule which is part of
the marker compound will be proportional to the number of complexes
formed within the cell host under assay between the targeted
polypeptide of interest and the recombinantly produced marker
compound of the invention.
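
The proportionality stated above can be illustrated by a minimal sketch in which the signal measured in a cell is expressed relative to the signal of a reference cell. The background subtraction, the function name `relative_complex_level` and all numerical values are assumptions introduced for the example; the specification itself only states that the signal is proportional to the number of complexes.

```python
def relative_complex_level(signal: float,
                           background: float,
                           reference_signal: float) -> float:
    """Estimate the number of complexes relative to a reference cell.

    Assumes (hypothetically) that the background-corrected signal is
    directly proportional to the number of complexes formed.
    """
    corrected_reference = reference_signal - background
    if corrected_reference <= 0:
        raise ValueError("reference signal must exceed the background")
    # Clamp at zero so that noise below the background is not reported
    # as a negative amount of complexes.
    corrected_signal = max(signal - background, 0.0)
    return corrected_signal / corrected_reference


# Hypothetical fluorescence readings, in arbitrary units.
ratio = relative_complex_level(signal=350.0, background=50.0,
                               reference_signal=150.0)
```

Under these assumptions, a ratio of 3.0 would indicate three times as many complexes as in the reference cell.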

The person skilled in the art may refer to the section
entitled "Acellular detection method of the invention" above to find the
teachings necessary for performing the detection step c) of the second
detection method described herein.

In a first embodiment of said second detection method of the
invention, the detection step c) consists of measuring the
fluorescence signal intrinsically emitted by the detectable molecule
comprised in the recombinantly expressed marker compound of the
invention.

In a second preferred embodiment of the second detection
method above, the detection step c) consists of submitting the detectable
molecule to a source of energy at the excitation wavelength of said
detectable molecule and measuring the light emitted at the emission
wavelength of said detectable molecule.

In still a further embodiment of the second detection method of
the invention, the detection step c) consists of measuring the catalytic
activity of the detectable molecule.

In another embodiment, the detection step c) consists of 
measuring the radioactivity emitted by the detectable molecule. 

In yet a further embodiment of the second detection method of
the invention, the detection step c) allows the location of the complexes
formed between the recombinantly produced marker compound and the
targeted polypeptide of interest within the transfected cell host.

A further object of the invention consists of a kit for detecting a
polypeptide of interest within a prokaryotic or a eukaryotic cell host,
wherein said kit comprises a nucleic acid encoding a marker compound
as defined herein, or a recombinant vector containing inserted therein a
nucleic acid encoding a marker compound of the invention.

Optionally, the detection kit above may further comprise the
reagents necessary to carry out the detection step c).




Cellular detection method of the invention using a marker
compound which is introduced within a cell host.

There is a third detection method according to the invention
wherein the marker compound comprising a Selected Interacting Domain
(SID®) polypeptide of SEQ ID N°1 to 38 or a variant thereof is
previously produced by any means and subsequently introduced into a
target cell host for the purpose of detecting a targeted polypeptide of
interest which binds specifically with said Selected Interacting Domain
(SID®) polypeptide.

Thus, the invention further relates to a method for detecting a
polypeptide of interest within a prokaryotic or a eukaryotic cell host,
wherein said method comprises the steps of:

a) providing a cell host to be assayed;

b) introducing a marker compound as defined herein within said
cell host;
and

c) detecting the complexes formed between the marker
compound and the polypeptide of interest within the cell host.

Taking into account the low molecular weight of the Selected
Interacting Domain (SID®) polypeptide selected from SEQ ID N°1 to 38
which is part of a marker compound of the invention, when compared
with conventional specific detection molecules such as antibodies or
antibody fragments, the introduction of a marker compound
of the invention into the interior of a target cell host will be much
easier to perform than the introduction within a cell host of
a conventional marker like a labelled antibody or a labelled antibody
fragment.

According to the third detection method of the invention defined
above, step b) of introducing the marker compound within the target cell
host may be performed by any technique well known in the art, including
electroporation, and the use of molecules that will facilitate the passage
of the marker compound of the invention through the cell membranes,
and typically the plasma membrane.





Such molecules that facilitate the passage of a marker
compound of the invention through cell membranes include, but are not
limited to, penetratin, such as Penetratin 1® (Encor, Gaithersburg, Md),
Antennapedia protein, cationic lipids and cationic polyacrylates.

Permeation enhancers which may be employed include bile
salts such as sodium glycocholate and other molecules such as
β-cyclodextrin. Bile salts are known to increase the absorption of
macromolecules across membranes (Pontiroli et al., 1987).

As already detailed for the second detection method of the
invention described in the previous section, the third detection method of
the invention also allows the localisation of the targeted polypeptide of
interest which is expressed by the cell host under assay, as well as the
qualitative and quantitative analysis of the expression of said target
polypeptide of interest.

The detection step c) according to the third detection method of
the invention described above may be carried out in the same way as
the detection step of either the first detection method or the
second detection method detailed in the previous sections herein.

In a first embodiment of the third detection method above, the
detection step c) consists of measuring the fluorescence signal
intrinsically emitted by the detectable molecule.

In a second embodiment, the detection step c) consists of
submitting the detectable molecule to a source of energy at the
excitation wavelength of said detectable molecule and measuring the
light emitted at the emission wavelength of said detectable molecule.

In a third embodiment, the detection step c) consists of
measuring the catalytic activity of the detectable molecule.

In a fourth embodiment, the detection step c) consists of
measuring the radioactivity emitted by the detectable molecule.

In a fifth embodiment of the third detection method of the
invention, the detection step c) allows the location of the complexes
formed between the marker compound and the polypeptide of interest
within the target cell host under assay.

A further object of the invention consists of a kit for detecting a
polypeptide of interest within a prokaryotic or a eukaryotic cell host,
wherein said kit comprises a marker compound as defined herein.

The detection kit above may further comprise the reagents
necessary to carry out the detection step c).

The detection kit above may also further comprise the reagents
necessary to facilitate the introduction of the marker compound within
the target cell host under assay.

SOLID PHASE DETECTION METHOD USING A SELECTED
INTERACTING DOMAIN (SID®) POLYPEPTIDE.

In a further aspect of the invention, the use of a Selected
Interacting Domain (SID®) polypeptide of SEQ ID N°1 to 38 or a variant
thereof for detection purposes includes a step wherein said Selected
Interacting Domain (SID®) polypeptide is immobilised on a suitable
substrate before bringing a sample to be assayed in contact with the
substrate onto which said Selected Interacting Domain (SID®)
polypeptide has been previously immobilised.

A subsequent step will consist in detecting the complexes
formed between the Selected Interacting Domain (SID®) polypeptide
immobilised on the substrate and the targeted polypeptide of interest the
presence of which is suspected in the sample assayed.

Thus, the invention also pertains to a fourth detection method
which consists of a method for detecting a polypeptide or a plurality of
polypeptides of interest within a sample, wherein said method comprises
the steps of:

a) providing a substrate onto which a Selected Interacting
Domain (SID®) polypeptide or a plurality of Selected Interacting Domain
(SID®) polypeptides is (are) immobilised;

b) bringing into contact the substrate defined in a) with the
sample to be assayed;

c) detecting the complexes formed between the Selected
Interacting Domain (SID®) polypeptide or the plurality of Selected
Interacting Domain (SID®) polypeptides and the target polypeptide or the
plurality of target polypeptides contained in the sample.

Substrates, supports or surfaces for immobilising protein
molecules are well known in the art, and many of them have been
described for performing solid phase immunoassays.

Preferably, a plurality of Selected Interacting Domain (SID®)
polypeptides of different amino acid sequences chosen among the
sequences SEQ ID N°1 to 38 are immobilised on the substrate used
according to the fourth detection method of the invention.

For example, a complete collection of Selected Interacting
Domain (SID®) polypeptides which have been determined according to
the methods described in the section entitled "Selected Interacting
Domain (SID®) polypeptides" above, using nucleic acids derived from
the H77 strain HCV genomic DNA as starting material, may be
immobilised on a suitable substrate.

According to this embodiment, the collection of Selected
Interacting Domain (SID®) polypeptides of SEQ ID N°1 to 38 is
immobilised on the substrate in an ordered manner, thus forming an
ordered area of SID® polypeptides immobilised at known locations of the
surface of said substrate.
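
By way of illustration only, immobilisation "at known locations" may be pictured as a table that associates each site of the substrate with the SEQ ID number of the polypeptide deposited there, so that a signal detected at a given site identifies the SID® polypeptide involved. The grid of 8 columns, the row-by-row filling order and the names `build_array_layout` and `seq_id_at` are assumptions chosen for this sketch and do not describe any particular substrate of the invention.

```python
from typing import Dict, Optional, Tuple

Site = Tuple[int, int]  # (row, column) of a site on the substrate


def build_array_layout(n_polypeptides: int = 38,
                       n_columns: int = 8) -> Dict[Site, int]:
    """Assign SEQ ID numbers 1..n to sites, filling the grid row by row."""
    return {divmod(index, n_columns): index + 1
            for index in range(n_polypeptides)}


def seq_id_at(layout: Dict[Site, int], row: int, column: int) -> Optional[int]:
    """Return the SEQ ID number immobilised at a site, or None if empty."""
    return layout.get((row, column))


layout = build_array_layout()
# A signal detected at row 4, column 5 would point to this SEQ ID number.
hit = seq_id_at(layout, 4, 5)
```

In this hypothetical layout the 38 polypeptides occupy four full rows and six sites of a fifth row; any other site is left empty.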

The substrate, support or surface may be a porous or a
non-porous water insoluble material. The support can be hydrophilic or
capable of being rendered hydrophilic and includes inorganic powders
such as silica, magnesium sulphate, and alumina; natural polymeric
materials, particularly cellulosic materials and materials derived from
cellulose, such as fiber containing papers; synthetic or modified naturally
occurring polymers, such as nitrocellulose, cellulose acetate, poly(vinyl
chloride), polyacrylamide, cross-linked dextran, agarose, polyacrylate,
polyethylene, polypropylene, poly(4-methylbutene), polystyrene,
polymethacrylate, poly(ethylene terephthalate), nylon, poly(vinyl butyrate),
said materials being used by themselves or in conjunction with other
materials; glass available as Bioglass, ceramics, metals and the like.

An ordered area onto which a plurality of Selected Interacting
Domain (SID®) polypeptides are immobilised may be manufactured
according to the techniques disclosed in the US Patent N°5,143,854 or
the PCT Application N°WO 92/10092, incorporated herein by reference
for all purposes. The combination of photolithographic and fabrication
techniques may, for example, enable each Selected Interacting Domain
(SID®) polypeptide to occupy a very small area ("site") on the support.
In some embodiments, the site may be as small as a few microns or even
a single Selected Interacting Domain (SID®) polypeptide.

In a first embodiment of the fourth detection method detailed
above, the plurality of Selected Interacting Domain (SID®) polypeptides
are immobilised on the substrate in an ordered manner.

In a second embodiment of said method, the Selected
Interacting Domain (SID®) polypeptide or the
plurality of Selected Interacting Domain (SID®) polypeptides are
covalently bound to the substrate.

In a third embodiment of said method, the Selected Interacting
Domain (SID®) polypeptide or the plurality of Selected Interacting
Domain (SID®) polypeptides are non-covalently bound to the substrate.
According to this specific embodiment, the Selected Interacting Domain
(SID®) polypeptide or the plurality of Selected Interacting Domain
(SID®) polypeptides are covalently bound to a first ligand molecule and
the substrate is coated with a second ligand molecule, wherein said
second ligand molecule specifically binds to the first ligand molecule.
According to such a specific embodiment, the first ligand may be biotin, in
which case the second ligand is most preferably streptavidin.

In still a further embodiment of the fourth detection method
according to the invention, the Selected Interacting Domain (SID®)
polypeptide or the plurality of Selected Interacting Domain (SID®)
polypeptides are covalently linked to a spacer, which spacer is itself also
covalently bound to the substrate in order to immobilise the Selected
Interacting Domain (SID®) polypeptide or the plurality of Selected
Interacting Domain (SID®) polypeptides onto said substrate. Such a
spacer may be a peptide polymer such as a poly-alanine or a poly-lysine
peptide of 10 to 15 amino acids in length.

In still a further embodiment of the fourth detection method
above, the detection step c) consists of detecting changes in the optical
characteristics of the substrate onto which the Selected Interacting
Domain (SID®) polypeptide or the plurality of Selected Interacting
Domain (SID®) polypeptides are bound.

In yet a further embodiment of the fourth detection method of
the invention, the detection step c) consists of bringing into contact the
substrate wherein complexes are formed between the targeted
polypeptide molecule contained in the sample assayed and the Selected
Interacting Domain (SID®) polypeptide or the plurality of Selected
Interacting Domain (SID®) polypeptides bound to said support, with a
detectable molecule having the ability to bind to such complexes.

A further object of the invention consists of a device or an
apparatus for the detection of a polypeptide or a plurality of polypeptides
of interest within a sample, wherein said device or apparatus comprises
a substrate onto which a Selected Interacting Domain (SID®)
polypeptide (or a plurality of Selected Interacting Domain (SID®)
polypeptides) is (are) immobilised.

Such a device or apparatus of the invention may
comprise or consist of a suitable substrate onto which the plurality of
Selected Interacting Domain (SID®) polypeptides are arranged in an
ordered manner, thus forming an area such as described above.

PHARMACEUTICAL COMPOSITIONS CONTAINING A SELECTED
INTERACTING DOMAIN (SID®) POLYPEPTIDE.

It follows from the method according to which a Selected
Interacting Domain (SID®) polypeptide of SEQ ID N°1 to 38 has been
selected and characterized that such a Selected Interacting Domain
(SID®) polypeptide or a variant thereof is both:

(i) endowed with highly specific binding properties to a (bait)
polypeptide of interest;
and

(ii) devoid of the biological activity of the naturally occurring
protein from which this Selected Interacting Domain (SID®) polypeptide
or a variant thereof is derived.

These original properties of a Selected Interacting Domain
(SID®) polypeptide of SEQ ID N°1 to 38 or a variant thereof allow its use
for interfering with a naturally occurring interaction between a first protein
and a second protein within the cell of an organism by the binding of said
Selected Interacting Domain (SID®) polypeptide specifically either to
said first protein or to said second protein.

The SID® polypeptides of the invention or variants thereof are
capable of interfering with the in vivo protein-protein interactions between
HCV proteins or between a HCV protein and a protein from the organism
which has been infected with the hepatitis C virus.

For example, the SID® polypeptide of SEQ ID N°2 interferes
with the naturally occurring interaction between the core and the NS3
proteins of HCV. Similarly, the SID® polypeptide of SEQ ID N°17 interferes
with the interaction between the NS4A and the NS4B proteins (see table
1).

Thus, another object of the invention consists of a
pharmaceutical composition comprising a pharmaceutically effective
amount of a Selected Interacting Domain (SID®) polypeptide or a variant
thereof.

The invention also relates to a pharmaceutical composition
comprising a pharmaceutically effective amount of a nucleic acid
comprising a polynucleotide encoding a Selected Interacting Domain
(SID®) polypeptide of SEQ ID N°1 to 38 or a variant thereof, which
polynucleotide is placed under the control of an appropriate regulatory
sequence.

Preferred nucleic acids are the nucleotide sequences SEQ ID
N°39 to 76.

The invention also pertains to a pharmaceutical composition 
comprising a pharmaceutically effective amount of a recombinant 
expression vector comprising a polynucleotide encoding the Selected 
Interacting Domain (SID®) polypeptide or a variant thereof. 

The invention also pertains to a method for preventing or curing
a viral infection by a hepatitis C virus in a human or an animal, wherein
said method comprises a step of administering to the human or animal
body a pharmaceutically effective amount of a Selected Interacting
Domain (SID®) polypeptide of SEQ ID N°1 to 38 or a variant thereof
which binds to a targeted viral or mammalian, typically human, protein.

A pharmaceutical composition as described above, wherein
said composition is administered by any route, such as intravenous
route, intramuscular route, oral route, or mucosal route with an
acceptable physiological carrier and/or adjuvant, also forms part of the
invention.

The use of a Selected Interacting Domain (SID®) polypeptide or a
variant thereof as a medicament for the prevention and/or treatment of
pathologies induced by HCV is most preferred.

The Selected Interacting Domain (SID®) polypeptides of SEQ
ID N°1 to 38 as active ingredients of a pharmaceutical composition will
be preferably in a soluble form combined with a pharmaceutically
acceptable vehicle.

Such compounds which can be used in a pharmaceutical
composition offer a new approach for preventing and/or treating
pathologies linked to infection by HCV. Preferably, these compounds will
be administered by the systemic route, in particular by the intravenous
route, by the intramuscular or intradermal route or by the oral route.

Their modes of administration, optimum dosages and galenic
forms can be determined according to the criteria generally taken into
account in establishing a treatment suited to a patient, such as for
example the age or body weight of the patient, the seriousness of his
general condition, the tolerance to treatment and the side effects
observed, and the like.

The identified compound can be administered to a mammal,
including a human patient, alone or in pharmaceutical compositions
where it is mixed with suitable carriers or excipients at
therapeutically effective doses to treat disorders associated with
prokaryotic micro-organism infection. Techniques for formulation and
administration of the compounds of the invention may be found in
"Remington's Pharmaceutical Sciences", Mack Publishing Co., Easton,
PA, latest edition.

For any Selected Interacting Domain (SID®) polypeptide or any
variant thereof used according to the invention, the therapeutically
effective dose can be estimated initially from cell culture assays. For
example, a dose can be formulated in animal models to achieve a
circulating concentration range that includes or encompasses a
concentration point or range shown to have the desired effect in an in
vitro system. Such information can be used to more accurately determine
useful doses in humans.

A therapeutically effective dose refers to that amount of the
compound that results in amelioration of symptoms in a patient.
Toxicity and therapeutic efficacy of such compounds can be determined
by standard pharmaceutical procedures in cell cultures or experimental
animals, e.g. for determining the LD50 (the dose lethal to 50% of the
test population) and the ED50 (the dose therapeutically effective in
50% of the population). The dose ratio between toxic and therapeutic
effects is the therapeutic index and it can be expressed as the ratio
between LD50 and ED50. Compounds which exhibit high therapeutic
indices are preferred.
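As a purely illustrative sketch of the ratio described above, the therapeutic index can be computed as LD50 divided by ED50. The function name and the numerical values used in the usage note are hypothetical and are not taken from the present description; both doses are assumed to be expressed in the same unit.

```python
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Return the therapeutic index, i.e. the ratio LD50 / ED50.

    Both doses must be positive and expressed in the same unit
    (for example mg per kg of body weight).
    """
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must both be positive")
    return ld50 / ed50
```

For instance, with a hypothetical LD50 of 500 and a hypothetical ED50 of 25 (same unit), `therapeutic_index(500.0, 25.0)` gives an index of 20; a compound with a higher index would be preferred.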

The data obtained from these cell culture assays and animal studies
can be used in formulating a range of dosage for use in humans. The
dosage of such compounds lies preferably within a range of circulating
concentrations that include the ED50, with little or no toxicity. The
dosage may vary within this range depending upon the dosage form
employed and the route of administration utilised. The exact
formulation, route of administration and dosage can be chosen by the
individual physician in view of the patient's condition (see, e.g.,
Fingl et al. 1975, in "The Pharmacological Basis of Therapeutics",
Ch. 1).

Dosage amount and interval may be adjusted individually to provide
plasma levels of the active compound which are sufficient to maintain
the modulating effects. Dosages necessary to achieve the modulating
effect will depend on individual characteristics and route of
administration.

The amount of composition administered will, of course, be
dependent on the subject being treated, on the subject's weight, the
severity of the affliction, the manner of administration and the
judgement of the prescribing physician.

The invention also pertains to a method for preventing or curing a
viral infection in a human or an animal, wherein said method comprises
the step of administering to the human or animal body a
pharmaceutically effective amount of a nucleic acid comprising a
polynucleotide encoding a Selected Interacting Domain (SID®)
polypeptide of SEQ ID N°1 to 38, or a variant thereof, and wherein said
polynucleotide is placed under the control of a regulatory sequence
which is functional in said human or said animal.

Preferred polynucleotides are the nucleic acids of SEQ ID N°39 to 76.

The invention also relates to a method for preventing or curing a
viral or bacterial infection in a human or an animal, wherein said
method comprises the step of administering to the human or animal body
a pharmaceutically effective amount of a recombinant expression vector
comprising a polynucleotide encoding a Selected Interacting Domain
(SID®) polypeptide which binds to a viral or bacterial protein.

Other characteristics and advantages of the invention appear in
the remainder of the description with the examples below, without
limiting the invention in any manner.

EXAMPLES:

EXAMPLE 1: Preparation of an HCV genomic collection.

1.A. Collection preparation and transformation in Escherichia coli

1.A.1. Fragmentation of genomic DNA preparation.

The genomic DNA of the infectious HCV strain H77 (Yanagi et
al., P.N.A.S. 1997, 94, 8738-43) is fragmented in a nebulizer (GATC) for
2 minutes at a pressure of 2 bars, precipitated and resuspended in
water.

The nebulized genomic DNA obtained is successively treated
with Mung Bean Nuclease (Biolabs) (30 minutes at 30°C), T4 DNA
polymerase (Biolabs) (10 minutes at 37°C) and Klenow enzyme
(Pharmacia) (10 minutes at room temperature and 1 hour at 16°C).
DNA is then extracted, precipitated and resuspended in water.

1.A.2. Ligation of linkers to blunt-ended genomic DNA

Oligonucleotide HGX931 (5' end phosphorylated) 1 µg/µl and HGX932
1 µg/µl.

Sequence of the oligo HGX931: 5'-GGGCCACGAA-3' (SEQ ID N°151).
Sequence of the oligo HGX932: 5'-TTCGTGGCCCCTG-3' (SEQ ID N°152).

Linkers were preincubated (5 minutes at 95°C, 10 minutes at
68°C, 15 minutes at 42°C), then cooled down to room temperature and
ligated with genomic DNA inserts at 16°C overnight.

Linkers were then removed on a separation column
(Chromaspin TE 400, Clontech), according to the manufacturer's
protocol.

1.A.3. Vector preparation

Plasmid pP6 (see figure 13) was prepared by replacing the
SpeI/XhoI fragment of pGAD3S2X with the double-stranded
oligonucleotide:

5'CTAGCCATGGCCGCAGGGGCCGCGGCCGCACTAGTGGGGATCCTTAATTAAAG
GGCCACTGGGGCCCCCCGTACCGGCGTCCCCGGCGCCGGCGTGATCACCCCTA 
GGAATTAATTTCCCGGTGACCCCGGGGGAGCT 3' (SEQ ID N°153). 

The pP6 vector is successively digested with SfiI and BamHI
restriction enzymes (Biolabs) for 1 hour at 37°C, extracted,
precipitated and resuspended in water. Digested plasmid vector
backbones are purified on a separation column (Chromaspin TE 400,
Clontech), according to the manufacturer's protocol.

1.A.4. Ligation between vector and insert of genomic DNA

The prepared vector is ligated overnight at 15°C with the
linkered, blunt-ended genomic DNA described in section 1.A.2 using T4
DNA ligase (Biolabs). The DNA is then precipitated and resuspended in
water.

1.A.5. Library transformation in Escherichia coli

Transform DNA from section 1.A.4. into Electromax DH10B
electrocompetent cells (Gibco BRL) with a Cell Porator apparatus (Gibco
BRL). Add 1 ml SOC medium and incubate transformed cells at 37°C for
1 hour. Add 9 ml of SOC medium per tube and plate on
LB+ampicillin medium. Scrape colonies with liquid LB medium. Aliquot
and freeze at -80°C.

The obtained collection of recombinant cell clones is named
HGXBHCV1.

1.B. Collection transformation in Saccharomyces 
cerevisiae 

The Saccharomyces cerevisiae strain Y187 (MATα Gal4Δ
Gal80Δ ade2-101 His3 Leu2-3, -112 Trp1-901 Ura3-52
URA3::UASGAL1-LacZ Met) is transformed with the HGXBHCV1 HCV
genomic DNA library.

The plasmid DNA contained in E. coli is extracted (Qiagen)
from aliquoted E. coli frozen cells (1.A.5.).
Grow Saccharomyces cerevisiae Y187 in YPGlu.

Yeast transformation is performed according to a standard
protocol (GIETZ et al., Yeast, 11, 355-360, 1995) using yeast carrier DNA
(Clontech). This experiment leads to 10⁴ to 5×10⁴ cells/µg DNA. Spread
2×10⁴ cells per plate on DO-Leu medium. Aliquot and freeze at -80°C.
The obtained collection of recombinant cell clones is named
HGXYHCV1.

1.C. Construction of bait plasmids

Plasmid pB5 (see figure 12) is prepared by replacing the
NcoI/SalI polylinker fragment with the double-stranded oligonucleotide:

5'CATGGCCGCAGGGGCCGCGGCCGCACTAGTGGGGATCCTTAATTAAAGGGCCA
CTGGGGCCCCCCGGCGTCCCCGGCGCCGGCGTGATCACCCCTAGGAATTAATTT
CCCGGTGACCCCGGGGGAGCT 3' (SEQ ID N°154).

The linkered genomic DNA described in section 1.A.2 is ligated into
pB5 that has been digested with SfiI restriction enzyme, and the DNA is
transformed into competent E. coli. Cells are grown and plasmid DNA is
extracted and sequenced. Those plasmids which code for in-frame fusion
proteins are used as bait plasmids.

EXAMPLE 2: Screening the collection with the yeast two-hybrid
system.

2.A. The mating protocol.

We have chosen the mating yeast two-hybrid system (first
described by FROMONT-RACINE et al., Nature Genetics, 1997, vol. 16,
277-282, Toward a functional analysis of the yeast genome through
exhaustive two-hybrid screens) for its advantages, but we could also
screen the HCV collection in a classical two-hybrid system as described
in Fields et al. or in a yeast reverse two-hybrid system.

The mating procedure allows a direct selection on selective
plates because the two fusion proteins are already produced in the
parental cells. No replica plating is required. This protocol is written
for the use of the library transformed into the Y187 strain.

Before mating, transform S. cerevisiae (CG1945 strain (MATa
Gal4-542 Gal80-538 ade2-101 His3Δ200 Leu2-3, -112 Trp1-901 Ura3-52
Lys2-801 URA::GAL4 17 mers (X3)-CyC1TATA-LacZ
LYS2::GAL1UAS-GAL1TATA-HIS3 CYHR)) according to step 1.B. and
spread on DO-Trp medium.

Day 1, morning: preculture

Preculture Y187 cells carrying the bait plasmid obtained at
step 1.C. in 20 ml DO-Trp medium. Grow at 30°C with vigorous agitation.

Day 1, late afternoon: culture

Measure the OD600nm of the DO-Trp preculture of Y187 cells
carrying the bait plasmid. The OD600nm must lie between 0.1
and 0.5 in order to correspond to a linear measurement.
Inoculate 50 ml DO-Trp at OD600nm 0.006/ml and grow overnight at 30°C
with vigorous agitation.

Day 2: mating

Medium and plates:
- 1 YPGlu 15 cm plate
- 50 ml tube with 13 ml DO-Leu-Trp-His
- 100 ml flask with 5 ml of YPGlu
- 8 DO-Leu-Trp-His plates
- 2 DO-Leu plates
- 2 DO-Trp plates
- 2 DO-Leu-Trp plates

Measure the OD600nm of the DO-Trp culture. It should be around 1.

For the mating, you must use twice as many bait cells as library
cells. To get a good mating efficiency, you must collect the cells at
10⁸ cells per cm².

Estimate the amount of bait culture (in ml) that makes up 30
OD600nm units for the mating with the prey library.

Thaw a vial containing the HGXYHCV1 library slowly on ice.
Add the 0.5 ml of the vial to 5 ml YPGlu. Let those cells recover at
30°C, under gentle agitation, for 10 minutes.
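
The two volume estimates used in this protocol (the inoculum giving an OD600nm of 0.006/ml in 50 ml, and the culture volume making up 30 OD600nm units) can be sketched as follows. This is only an illustration: the function names are hypothetical, one OD600nm unit is assumed to be the amount of cells contained in 1 ml of culture at OD600nm = 1, and the measured OD values in the usage note are invented.

```python
def inoculum_volume_ml(target_od_per_ml: float, final_volume_ml: float,
                       preculture_od: float) -> float:
    """Volume of preculture (ml) to dilute into final_volume_ml so that the
    new culture starts at target_od_per_ml (simple C1*V1 = C2*V2 dilution)."""
    if preculture_od <= 0:
        raise ValueError("preculture OD must be positive")
    return target_od_per_ml * final_volume_ml / preculture_od


def volume_for_od_units_ml(od_units: float, culture_od: float) -> float:
    """Volume of culture (ml) containing the requested number of OD units,
    assuming 1 OD unit = 1 ml of culture at OD600nm = 1."""
    if culture_od <= 0:
        raise ValueError("culture OD must be positive")
    return od_units / culture_od
```

For example, with a hypothetical preculture at OD600nm 0.3, `inoculum_volume_ml(0.006, 50, 0.3)` gives about 1 ml of preculture, and with an overnight culture at OD600nm 1.2, `volume_for_od_units_ml(30, 1.2)` gives about 25 ml of bait culture.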

Mating

Put the 30 OD600nm units of bait culture into a 50 ml Falcon tube.

Add the HGXYHCV1 library culture to the bait culture.
Centrifuge, discard the supernatant and resuspend in 0.8 ml YPGlu
medium.

Distribute the cells onto a YPGlu plate with glass beads. Spread
cells by shaking the plates.

Incubate the plate cells-up at 30°C for 4 h 30 min.

Collection of mated cells

Wash and rinse the plate with 6 ml and 7 ml consecutively of
DO-Leu-Trp-His.

Perform two parallel serial ten-fold dilutions in 500 µl
DO-Leu-Trp-His up to 1/10,000. Spread out 50 µl of each 1/10,000
dilution onto DO-Leu and DO-Trp plates and 50 µl of each 1/1,000
dilution onto DO-Leu-Trp plates.

Spread 3.2 ml of collected cells in 400 µl aliquots on
DO-Leu-Trp-His+Tet plates.

Day 4

Selection of clones able to grow on DO-Leu-Trp-His+Tetracyclin:
this medium allows us to isolate diploid clones presenting an
interaction.

Count the Trp+Leu+ colonies on control plates and the total
number of His+ colonies on the DO-Leu-Trp-His+Tetracyclin plates.

The number of His+ cell clones will define which protocol is to
be followed.

For 2×10⁶ Trp+Leu+ colonies:

- if the number of His+ cell clones is < 95: process the luminometry
protocol on all colonies;

- if the number of His+ cell clones is > 95 and < 5000: process the
luminometry protocol on 95 colonies;

- if the number of His+ cell clones is > 5000: repeat the screen using
DO-Leu-Trp-His+Tetracyclin plates containing 3-aminotriazol.
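
The decision rule above can be sketched as a small function. This is a hedged illustration only: the function name is hypothetical, the upper threshold is read as 5000 (the value given in the second condition), and the handling of the exact boundary values (95 and 5000), which the text does not specify, is an assumption.

```python
def choose_protocol(his_plus_clones: int,
                    lower: int = 95,
                    upper: int = 5000) -> str:
    """Select the follow-up protocol from the number of His+ cell clones
    counted for about 2x10^6 Trp+Leu+ colonies.

    Assumption: a count exactly equal to `lower` is treated like the low
    case, and a count equal to or above `upper` like the high case.
    """
    if his_plus_clones < 0:
        raise ValueError("colony count cannot be negative")
    if his_plus_clones <= lower:
        return "luminometry on all colonies"
    if his_plus_clones < upper:
        return "luminometry on 95 colonies"
    return "repeat screen with 3-aminotriazol"
```

For example, `choose_protocol(40)` selects the luminometry protocol on all colonies, whereas `choose_protocol(12000)` indicates that the screen should be repeated on plates containing 3-aminotriazol.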

2.B. The luminometry assay

Grow His+ colonies overnight at 30°C in microtiter plates
containing DO-Leu-Trp-His+Tetracyclin medium with shaking. The next
day, dilute the overnight culture 15-fold into a new microtiter plate
containing the same medium. Incubate for 5 hours at 30°C with shaking.
Dilute samples 5-fold and read OD600nm. Dilute again to obtain between
10,000 and 75,000 yeast cells/well in 100 µl final volume.

Per well, add 76 µl of One Step Yeast Lysis Buffer (Tropix), 20 µl
Sapphire II Enhancer (Tropix) and 4 µl Galacton Star (Tropix), and
incubate for 40 minutes at 30°C.

Measure the β-Gal read-out (L) using a luminometer (Trilux,
Wallac).

Calculate the value of OD600nm × L and select the interacting preys
having the highest values.

At this step of the protocol, we have isolated diploid cell clones
presenting an interaction. The next step is to identify the
polypeptides involved in the selected interactions.

EXAMPLE 3: Identification of positive clones

3.A. PCR on yeast colonies

Introduction

PCR amplification of fragments of plasmid DNA directly on
yeast colonies is a quick and efficient procedure to identify sequences
cloned into this plasmid. It is directly derived from a published
protocol (Wang H. et al., Analytical Biochemistry, 237, 145-146, 1996).
However, it is not a standardized protocol: in our hands it varies from
strain to strain, and is dependent on experimental conditions (number
of cells, Taq polymerase source, etc.). This protocol should be
optimized to specific local conditions.

Materials

- For 1 well, PCR mix composition is:
32.5 µl water,
5 µl 10X PCR buffer (Pharmacia),
1 µl dNTP 10 mM,
0.5 µl Taq polymerase (5 U/µl, Pharmacia),
0.5 µl oligonucleotide ABS1 10 pmole/µl:
5'-GCGTTTGGAATCACTACAGG-3',
0.5 µl oligonucleotide ABS2 10 pmole/µl:
5'-CACGATGCACGTTGAAGTG-3'.
- 1N NaOH.
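
When several wells are processed at once, the per-well volumes listed above can be scaled into a master mix. The following is a minimal sketch under stated assumptions: the function name and the 10% pipetting overage are hypothetical conveniences and are not part of the protocol, and only the per-well volumes are taken from the list above.

```python
# Per-well volumes in microlitres, as listed in the PCR mix composition.
PCR_MIX_UL_PER_WELL = {
    "water": 32.5,
    "10X PCR buffer": 5.0,
    "dNTP 10 mM": 1.0,
    "Taq polymerase": 0.5,
    "oligonucleotide ABS1": 0.5,
    "oligonucleotide ABS2": 0.5,
}


def master_mix(n_wells: int, overage: float = 0.10) -> dict:
    """Scale the per-well PCR mix to n_wells.

    `overage` is a hypothetical extra fraction (10% by default) added to
    compensate for pipetting losses; set it to 0.0 for exact volumes.
    """
    if n_wells <= 0:
        raise ValueError("n_wells must be a positive integer")
    if overage < 0:
        raise ValueError("overage cannot be negative")
    factor = n_wells * (1.0 + overage)
    return {name: volume * factor
            for name, volume in PCR_MIX_UL_PER_WELL.items()}
```

With `master_mix(10, overage=0.0)` the mix contains 325 µl of water and 400 µl in total, i.e. 40 µl of mix per well.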

Experiment

Grow positive colonies overnight at 30°C in a 96 well cell
culture cluster (Costar), containing 150 µl DO-Leu-Trp-His+Tetracyclin,
with shaking. Resuspend the culture and immediately transfer 100 µl to
a Thermowell 96 (Costar).

Centrifuge for 5 minutes at 4000 rpm at room temperature.

Remove the supernatant. Dispense 5 µl NaOH in each well and shake for
1 minute.

Place the Thermowell in the thermocycler (GeneAmp 9700,
Perkin Elmer) for 5 minutes at 99.9°C and then 10 minutes at 4°C.
In each well, add the PCR mix and shake well.
Set up the PCR program as follows:



94°C 3 minutes

35 cycles of:
94°C 30 seconds
53°C 1 minute 30 seconds
72°C 3 minutes

72°C 5 minutes

15°C ∞ (hold)



Check the quality, the quantity and the length of the PCR
fragment on agarose gel.

The length of the cloned fragment is the estimated length of the
PCR fragment minus 300 base pairs that correspond to the amplified
flanking plasmid sequences.
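
The subtraction described above can be written as a one-line helper. This is only an illustrative sketch: the function name is hypothetical, and the PCR fragment length is assumed to be an estimate read from the agarose gel, in base pairs.

```python
FLANKING_PLASMID_BP = 300  # amplified flanking plasmid sequences, per the text


def cloned_fragment_length(pcr_fragment_bp: int) -> int:
    """Estimate the cloned insert length (bp) from the PCR fragment length."""
    if pcr_fragment_bp < FLANKING_PLASMID_BP:
        raise ValueError("PCR fragment is shorter than the flanking sequences")
    return pcr_fragment_bp - FLANKING_PLASMID_BP
```

For example, a PCR band estimated at 850 bp would correspond to a cloned fragment of about 550 bp.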

3.B. Plasmid rescue from yeast by electroporation

Introduction

The previous protocol of PCR on yeast cells may not be
successful. In such a case, we rescue plasmids from yeast by
electroporation. This experiment allows the recovery of prey plasmids
from yeast cells by transformation of E. coli with a yeast cellular
extract. We can then amplify the prey plasmid and sequence the cloned
fragment.

Material

Plasmid rescue

- Glass beads 425-600 µm (Sigma)
- Phenol/chloroform (1/1) premixed with isoamyl alcohol (Amresco)
- Extraction buffer: 2% Triton X100, 1% SDS, 100 mM NaCl, 10 mM
TrisHCl pH 8.0, 1 mM EDTA pH 8.0
- Ethanol/NH4Ac mix: 6 volumes ethanol with 7.5 M NH4 acetate
- 70% ethanol
- Yeast cells in patches on plates

Electroporation

- SOC medium
- M9 medium
- Selective plates: M9-Leu+Ampicillin
- 2 mm electroporation cuvettes (Eurogentec)

Experiment
Plasmid rescue

Prepare a cell patch on DO-Leu-Trp-His with the cell culture of
section 2.C.

Scrape the cells of each patch into an Eppendorf tube, add 300 µl of
glass beads to each tube, then add 200 µl extraction buffer and
200 µl phenol:chloroform:isoamyl alcohol (25:24:1).

Centrifuge tubes for 10 minutes at 15000 rpm.

Transfer 180 µl supernatant to a sterile Eppendorf tube and add to each
500 µl ethanol/NH4Ac; vortex.

Centrifuge tubes for 15 minutes at 15000 rpm at 4°C.

Wash the pellet with 200 µl 70% ethanol, remove the ethanol and dry the
pellet. Resuspend the pellet in 10 µl water. Store extracts at -20°C.

Electroporation

Material: Electrocompetent MC1066 cells prepared according to
standard protocols (Maniatis).

Add 1 µl of yeast plasmid DNA extract to a pre-chilled Eppendorf tube,
and keep on ice.

Mix the 1 µl yeast plasmid DNA extract sample with 20 µl
electrocompetent cells and transfer into a cold electroporation cuvette.

Set the Biorad electroporator to 200 ohms resistance, 25 µF
capacitance and 2.5 kV. Place the cuvette in the cuvette holder and
electroporate.

Add 1 ml SOC into the cuvette and transfer the cell mix into a sterile
Eppendorf tube.

Let cells recover for 30 minutes at 37°C, spin the cells down for
1 minute at 4000 x g and pour off the supernatant. Keep about 100 µl
medium and use it to resuspend the cells and spread them on selective
plates (e.g. M9-Leu plates).

Incubate plates for 36 hours at 37°C.

Grow one colony and extract plasmids. Check the presence and size of
the insert through enzymatic digestion and agarose gel. Sequence the
insert.

EXAMPLE 4: Protein-protein interaction.

For each bait, the previous protocol leads to the identification
of prey polynucleotide sequences. Using a suitable software program
(e.g. Blastwu, available on the Internet site of the University of
Washington: http://bioweb.pasteur.fr/seqanal/interfaces/blastwu.html),
the region of the HCV genome encoded by the prey fragment may be
determined, as well as whether or not the encoded fusion proteins are
in the same open reading frame of translation as the HCV polyprotein.

EXAMPLE 5: Identification of SID®

The presence of contiguous polypeptides in the HCV genome
and the high complexity of the prey library used prevent the
determination of SID®s by previous means, since prey fragments can
overlap multiple polypeptides. The high complexity of the prey library
used relative to the small genome size also prevents such a simple
analysis, since prey fragments can overlap multiple interacting
domains. It was also necessary to overcome the problems caused by
protein preys encoded by out-of-frame fusions of regions of the HCV
genome.

In order to determine the SID®s for a particular bait protein, it
was therefore necessary to devise a suitable algorithm which would take
into account all these problems:

5.1. The prey fragments are initially sorted according to which
reading frame of the polypeptide sequence they correspond to. This
enables the separation of physiologically relevant prey proteins from
out-of-frame fusions which bind in the two-hybrid assay.

5.2. Each prey fragment is compared pairwise with other prey
fragments and two fragments are clustered together if they overlap by
more than 30% of their lengths (see fig. 8). Further fragments are
assigned to the cluster if, and only if, they overlap all the fragments
in the cluster by more than 30% of their length.

5.3. For each cluster of fragments thus produced, a pre-SID is
defined as the intersection of all the fragments present in the cluster
defined in 5.2 (figure 9).

5.4. The pre-SIDs defined in 5.3 are then analysed pairwise and
if the region of intersection between two pre-SIDs is greater than 30 bp
then a SID® is defined as this region of intersection. If the non-
intersecting region of a pre-SID is of more than 30 bp in length and this
non-intersecting region represents more than 30% of the length of one of
the fragments that comprises this region, then this non-intersecting
region is also defined as a SID® (figure 10).

5.5. The number of fragments contributing to each SID® defined
in 5.4 is counted. In the case of overlapping SID®s, the SID® which
contains the most fragments is identified, and all the fragments which
contribute to this SID® are removed from overlapping SID®s. The
inspection of the fragments which remain in these overlapping SID®s
determines the final sequence of the SID® (figure 11).
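
Steps 5.2 and 5.3 can be sketched as follows. This is a minimal illustration under stated assumptions, not the algorithm as actually implemented: fragments are represented as hypothetical (start, end) nucleotide positions with the end excluded, "more than 30% of their lengths" is read as more than 30% of the length of each of the two fragments compared, and clusters are built greedily in order of start position, which is only one possible reading of step 5.2. Steps 5.1, 5.4 and 5.5 are not covered.

```python
from typing import List, Tuple

Fragment = Tuple[int, int]  # (start, end) nucleotide positions, end excluded


def overlap_length(a: Fragment, b: Fragment) -> int:
    """Number of positions shared by two fragments (0 if they are disjoint)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def overlaps_enough(a: Fragment, b: Fragment, min_fraction: float = 0.30) -> bool:
    """True if the overlap exceeds min_fraction of the length of BOTH fragments
    (assumed reading of 'more than 30% of their lengths')."""
    shared = overlap_length(a, b)
    return (shared > min_fraction * (a[1] - a[0])
            and shared > min_fraction * (b[1] - b[0]))


def cluster_fragments(fragments: List[Fragment],
                      min_fraction: float = 0.30) -> List[List[Fragment]]:
    """Step 5.2 (greedy sketch): a fragment joins the first cluster in which
    it sufficiently overlaps every member; otherwise it starts a new cluster."""
    clusters: List[List[Fragment]] = []
    for fragment in sorted(fragments):
        for cluster in clusters:
            if all(overlaps_enough(fragment, member, min_fraction)
                   for member in cluster):
                cluster.append(fragment)
                break
        else:
            clusters.append([fragment])
    return clusters


def pre_sid(cluster: List[Fragment]) -> Fragment:
    """Step 5.3: the pre-SID is the intersection of all fragments in a cluster."""
    return (max(start for start, _ in cluster),
            min(end for _, end in cluster))
```

With the invented fragments (100, 400), (150, 450), (200, 420) and (900, 1200), the first three fall into one cluster whose pre-SID is (200, 400), while the last fragment forms its own cluster with pre-SID (900, 1200). Because every pair of fragments in a cluster overlaps, the intersection computed in `pre_sid` is never empty.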
TABLE 1

Summary of the protein-protein interactions between the SID
polypeptides of the invention and H77 strain HCV polypeptides

(1) Nucleic acid sequence encoding the polypeptide from the H77 strain
of HCV which binds to the SID polypeptide (4) described in the same
line.

(2) 5'-end and 3'-end nucleotide positions of the sequence SEQ ID (1) in
reference to the nomenclature disclosed by Yanagi et al. (1997).

(3) Aminoacid sequence of the polypeptide from the H77 strain of HCV
which binds to the SID polypeptide (4) described in the same line.

(4) Nucleic acid sequence encoding the SID polypeptide which binds to
the polypeptide of the aminoacid sequence (3) described in the same
line.

(5) Aminoacid sequence of the SID polypeptide which binds to the
polypeptide of the aminoacid sequence (3) described in the same line.
Claims

1. A nucleic acid which encodes a polypeptide selected from the group 
consisting of the amino acid sequences SEQ ID N° 1 to 38 or a variant 
thereof, and a sequence complementary thereto. 

2. A nucleic acid according to claim 1 which encodes a polypeptide 
having at least 95% aminoacid identity with a polypeptide selected from 
the group consisting of the aminoacid sequences SEQ ID N°1 to 38, and 
a sequence complementary thereto. 

3. A nucleic acid according to claim 1 which is selected from the group 
consisting of the sequences SEQ ID N°39 to 76 or a sequence 
complementary thereto. 

4. A nucleic acid according to claim 1 which possesses at least 95% 
nucleic acid identity with a nucleic acid selected from the group 
consisting of the sequences SEQ ID N°39 to 76. 

5. A nucleic acid according to claim 1 encoding a polypeptide having an
aminoacid sequence selected from the group consisting of the
sequences consisting of at least:

- 45 consecutive aminoacids of SEQ ID N°1;
- 30 consecutive aminoacids of SEQ ID N°2;
- 65 consecutive aminoacids of SEQ ID N°3;
- 30 consecutive aminoacids of SEQ ID N°4;
- 130 consecutive aminoacids of SEQ ID N°5;
- 25 consecutive aminoacids of SEQ ID N°6;
- 23 consecutive aminoacids of SEQ ID N°7;
- 48 consecutive aminoacids of SEQ ID N°8;
- 36 consecutive aminoacids of SEQ ID N°9;
- 25 consecutive aminoacids of SEQ ID N°10;
- 24 consecutive aminoacids of SEQ ID N°11;
- 37 consecutive aminoacids of SEQ ID N°12;
- 25 consecutive aminoacids of SEQ ID N°13;
- 30 consecutive aminoacids of SEQ ID N°14;
- 27 consecutive aminoacids of SEQ ID N°15;
- 69 consecutive aminoacids of SEQ ID N°16;
- 130 consecutive aminoacids of SEQ ID N°17;
- 33 consecutive aminoacids of SEQ ID N°18;
- 25 consecutive aminoacids of SEQ ID N°19;
- 40 consecutive aminoacids of SEQ ID N°20;
- 78 consecutive aminoacids of SEQ ID N°21;
- 39 consecutive aminoacids of SEQ ID N°22;
- 57 consecutive aminoacids of SEQ ID N°23;
- 26 consecutive aminoacids of SEQ ID N°24;
- 68 consecutive aminoacids of SEQ ID N°25;
- 34 consecutive aminoacids of SEQ ID N°26;
- 42 consecutive aminoacids of SEQ ID N°27;
- 48 consecutive aminoacids of SEQ ID N°28;
- 102 consecutive aminoacids of SEQ ID N°29;
- 49 consecutive aminoacids of SEQ ID N°30;
- 92 consecutive aminoacids of SEQ ID N°31;
- 71 consecutive aminoacids of SEQ ID N°32;
- 55 consecutive aminoacids of SEQ ID N°33;
- 69 consecutive aminoacids of SEQ ID N°34;
- 23 consecutive aminoacids of SEQ ID N°35;
- 33 consecutive aminoacids of SEQ ID N°36;
- 32 consecutive aminoacids of SEQ ID N°37;
and
- 22 consecutive aminoacids of SEQ ID N°38.

6. A nucleic acid according to claim 1 encoding a polypeptide having an
amino acid sequence comprising from one to three substitutions,
additions or deletions of one amino acid as regards a polypeptide
selected from the group consisting of the amino acid sequences SEQ ID
N°1 to 38 or as regards a polypeptide according to claim 5 and a
sequence complementary thereto.

7. A polypeptide selected from the group consisting of the amino acid
sequences SEQ ID N°1 to 38 or a variant thereof.

8. A polypeptide according to claim 7 having at least 95% aminoacid
identity with a polypeptide selected from the group consisting of the
aminoacid sequences SEQ ID N°1 to 38.

9. A polypeptide according to claim 7 having an aminoacid sequence
selected from the group consisting of the sequences consisting of at
least:

- 45 consecutive aminoacids of SEQ ID N°1;
- 30 consecutive aminoacids of SEQ ID N°2;
- 65 consecutive aminoacids of SEQ ID N°3;
- 30 consecutive aminoacids of SEQ ID N°4;
- 130 consecutive aminoacids of SEQ ID N°5;
- 25 consecutive aminoacids of SEQ ID N°6;
- 23 consecutive aminoacids of SEQ ID N°7;
- 48 consecutive aminoacids of SEQ ID N°8;
- 36 consecutive aminoacids of SEQ ID N°9;
- 25 consecutive aminoacids of SEQ ID N°10;
- 24 consecutive aminoacids of SEQ ID N°11;
- 37 consecutive aminoacids of SEQ ID N°12;
- 25 consecutive aminoacids of SEQ ID N°13;
- 30 consecutive aminoacids of SEQ ID N°14;
- 27 consecutive aminoacids of SEQ ID N°15;
- 69 consecutive aminoacids of SEQ ID N°16;
- 130 consecutive aminoacids of SEQ ID N°17;
- 33 consecutive aminoacids of SEQ ID N°18;
- 25 consecutive aminoacids of SEQ ID N°19;
- 40 consecutive aminoacids of SEQ ID N°20;
- 78 consecutive aminoacids of SEQ ID N°21;
- 39 consecutive aminoacids of SEQ ID N°22;
- 57 consecutive aminoacids of SEQ ID N°23;
- 26 consecutive aminoacids of SEQ ID N°24;
- 68 consecutive aminoacids of SEQ ID N°25;
- 34 consecutive aminoacids of SEQ ID N°26;
- 42 consecutive aminoacids of SEQ ID N°27;
- 48 consecutive aminoacids of SEQ ID N°28;
- 102 consecutive aminoacids of SEQ ID N°29;
- 49 consecutive aminoacids of SEQ ID N°30;
- 92 consecutive aminoacids of SEQ ID N°31;
- 71 consecutive aminoacids of SEQ ID N°32;
- 55 consecutive aminoacids of SEQ ID N°33;
- 69 consecutive aminoacids of SEQ ID N°34;
- 23 consecutive aminoacids of SEQ ID N°35;
- 33 consecutive aminoacids of SEQ ID N°36;
- 32 consecutive aminoacids of SEQ ID N°37;
and
- 22 consecutive aminoacids of SEQ ID N°38.

10. A polypeptide according to claim 7 having an amino acid sequence
comprising from one to three substitutions, additions or deletions of one
amino acid as regards a polypeptide selected from the group consisting
of the amino acid sequences SEQ ID N°1 to 38 or as regards a
polypeptide according to claim 9.

11. An antibody directed against a polypeptide according to any one of
claims 7 to 10.

12. A recombinant vector containing inserted therein a nucleic acid
according to any one of claims 1 to 5.

13. The recombinant vector according to claim 12 which is selected from
the group consisting of the plasmids pACTIIst and pAS2ΔΔ.

14. The recombinant vector according to claim 12 which is selected from
the group consisting of pT25, pKT25, pUT18 and pUT18C.

15. The recombinant vector according to claim 12 which is selected from
the group consisting of pP6 and pB5.

16. A cell host transformed with a vector according to any one of claims
12 to 15 or with a nucleic acid according to any one of claims 1 to 5.

17. A method for producing a polypeptide according to any one of claims
7 to 10, wherein said method comprises the steps of:

a) cultivating a cell host according to claim 16 in an appropriate culture
medium;

b) recovering the recombinant polypeptide from the culture supernatant
or from the cell lysate.

18. A yeast two-hybrid system method for selecting a recombinant cell
clone containing a vector comprising a nucleic acid insert encoding a
prey polypeptide which binds with a SID® polypeptide, wherein said
method comprises the steps of:

a) mating at least one first recombinant yeast cell clone of a collection of
recombinant yeast cell clones transformed with a plasmid containing the
prey polynucleotide to be assayed with a second haploid recombinant
Saccharomyces cerevisiae cell clone transformed with a plasmid
containing a bait polynucleotide encoding a SID® polypeptide according
to any one of claims 7 to 10;

b) cultivating diploid cells obtained in step a) on a selective medium; and

c) selecting recombinant cell clones which grow on said selective
medium.

19. The yeast two-hybrid method of claim 18 which further comprises the
step of:

d) characterizing the prey polynucleotide contained in each recombinant
cell clone selected in step c).

20. A bacterial two-hybrid method for identifying a recombinant cell clone
containing a prey polynucleotide encoding a prey polypeptide which
binds with a SID® polypeptide, wherein said method comprises the steps
of:

a) transforming bacterial cell clones with a plasmid containing a SID®
polynucleotide encoding a SID® polypeptide according to any one of
claims 7 to 10;

b) rescuing prey plasmids containing prey polynucleotides wherein each
prey polynucleotide is a DNA fragment from the genome of a desired
organism and wherein each prey plasmid is contained in one
recombinant yeast cell clone of a collection of recombinant yeast cell
clones;

c) transforming the recombinant bacterial cell clones obtained in step a)
with the plasmids rescued in step b);

d) cultivating bacterial recombinant cells obtained in step c) on a
selective medium; and

e) selecting recombinant cell clones which grow on said selective
medium.

21. The bacterial two-hybrid method of claim 20, wherein said method
further comprises the step of f) characterising the prey polynucleotide
contained in each recombinant cell clone selected at step e).

22. The method according to any one of claims 18 to 21, wherein the
prey polypeptide is a human polypeptide.

23. The method according to any one of claims 18 to 21, wherein the
prey polypeptide is an HCV polypeptide.

24. The method of claim 23, wherein the prey polypeptide is encoded by
a strain of HCV which is pathogenic for humans.

25. A set of two nucleic acids consisting of:

i) a first nucleic acid encoding a SID® polypeptide according to any one
of claims 7 to 10; and

ii) a second nucleic acid encoding a prey polypeptide which binds
specifically with the SID® polypeptide defined in i).

26. A set of two nucleic acids which is selected from the group consisting
of the following sets:

SEQ ID N°77/SEQ ID N°1; SEQ ID N°78/SEQ ID N°2; SEQ ID N°78/SEQ ID N°3;
SEQ ID N°79/SEQ ID N°4; SEQ ID N°80/SEQ ID N°5; SEQ ID N°81/SEQ ID N°6;
SEQ ID N°82/SEQ ID N°7; SEQ ID N°83/SEQ ID N°8; SEQ ID N°84/SEQ ID N°9;
SEQ ID N°85/SEQ ID N°10; SEQ ID N°86/SEQ ID N°11; SEQ ID N°87/SEQ ID N°12;
SEQ ID N°88/SEQ ID N°13; SEQ ID N°89/SEQ ID N°14; SEQ ID N°90/SEQ ID N°15;
SEQ ID N°91/SEQ ID N°16; SEQ ID N°92/SEQ ID N°17; SEQ ID N°93/SEQ ID N°18;
SEQ ID N°94/SEQ ID N°19; SEQ ID N°95/SEQ ID N°20; SEQ ID N°96/SEQ ID N°21;
SEQ ID N°97/SEQ ID N°22; SEQ ID N°98/SEQ ID N°23; SEQ ID N°99/SEQ ID N°24;
SEQ ID N°100/SEQ ID N°25; SEQ ID N°101/SEQ ID N°26; SEQ ID N°102/SEQ ID N°27;
SEQ ID N°103/SEQ ID N°28; SEQ ID N°104/SEQ ID N°29; SEQ ID N°105/SEQ ID N°30;
SEQ ID N°106/SEQ ID N°31; SEQ ID N°107/SEQ ID N°32; SEQ ID N°108/SEQ ID N°33;
SEQ ID N°109/SEQ ID N°34; SEQ ID N°110/SEQ ID N°35; SEQ ID N°111/SEQ ID N°36;
SEQ ID N°112/SEQ ID N°37; and SEQ ID N°113/SEQ ID N°38.

27. A set of two polypeptides consisting of : 
i) a first polypeptide consisting of a SID® polypeptide according to any
one of claims 7 to 10; and 

ii) a second polypeptide, also termed prey polypeptide, which binds 
specifically with the first polypeptide. 

28. A set of two polypeptides which is selected from the group consisting
of the following sets: 

SEQ ID N°114/SEQ ID N°39; SEQ ID N°115/SEQ ID N°40;
SEQ ID N°115/SEQ ID N°41; SEQ ID N°116/SEQ ID N°42; SEQ ID
N°117/SEQ ID N°43; SEQ ID N°118/SEQ ID N°44; SEQ ID N°119/SEQ
ID N°45; SEQ ID N°120/SEQ ID N°46; SEQ ID N°121/SEQ ID N°47;
SEQ ID N°122/SEQ ID N°48; SEQ ID N°123/SEQ ID N°49; SEQ ID
N°124/SEQ ID N°50; SEQ ID N°125/SEQ ID N°51; SEQ ID N°126/SEQ
ID N°52; SEQ ID N°127/SEQ ID N°53; SEQ ID N°128/SEQ ID N°54;
SEQ ID N°129/SEQ ID N°55; SEQ ID N°130/SEQ ID N°56; SEQ ID
N°131/SEQ ID N°57; SEQ ID N°132/SEQ ID N°58; SEQ ID N°133/SEQ
ID N°59; SEQ ID N°134/SEQ ID N°60; SEQ ID N°135/SEQ ID N°61;
SEQ ID N°136/SEQ ID N°62; SEQ ID N°137/SEQ ID N°63; SEQ ID
N°138/SEQ ID N°64; SEQ ID N°139/SEQ ID N°65; SEQ ID N°140/SEQ
ID N°66; SEQ ID N°141/SEQ ID N°67; SEQ ID N°142/SEQ ID N°68;
SEQ ID N°143/SEQ ID N°69; SEQ ID N°144/SEQ ID N°70; SEQ ID
N°145/SEQ ID N°71; SEQ ID N°146/SEQ ID N°72; SEQ ID N°147/SEQ
ID N°73; SEQ ID N°148/SEQ ID N°74; SEQ ID N°149/SEQ ID N°75; and
SEQ ID N°150/SEQ ID N°76.

29. A complex formed between the two polypeptides of claim 29 or 30.

30. A method for selecting a molecule which inhibits the binding between 
a set of two polypeptides according to claim 27 or 28, wherein said 
method comprises the steps of : 

a) cultivating a recombinant host cell containing a reporter gene the
expression of which is toxic for said recombinant host cell, said host cell
being transformed with two vectors wherein :

i) the first vector contains a nucleic acid comprising a polynucleotide 
encoding a first hybrid polypeptide containing one of said two 
polypeptides and a DNA binding domain; 

ii) the second vector contains a nucleic acid comprising a polynucleotide
encoding a second hybrid polypeptide containing the second of said two 
polypeptides and an activating domain capable of activating said toxic 
reporter gene when the first and the second hybrid polypeptides are 
interacting; 

on a selective medium containing the molecule to be tested and allowing
the growth of said recombinant host cell when the toxic reporter gene is 
not activated; and 

b) selecting the molecule which inhibits the growth of the recombinant 
host cell defined in step a). 

31. A method for selecting a molecule which inhibits the protein-protein 
interaction of a set of two polypeptides according to claim 29 or 30, 
wherein said method comprises the steps of : 

a) cultivating a recombinant host cell containing a reporter gene the 
expression of which is toxic for said recombinant host cell, said host cell
being transformed with two vectors wherein : 




i) the first vector contains a nucleic acid comprising a polynucleotide 
encoding a first hybrid polypeptide containing one of said set of two 
polypeptides and the first domain of an enzyme; 

ii) the second vector contains a nucleic acid comprising a polynucleotide 
encoding a second hybrid polypeptide containing the second of said two
polypeptides and the second part of said enzyme capable of activating
said toxic reporter gene when the first and the second hybrid 
polypeptides are interacting, said interaction recovering the catalytic 
activity of the enzyme; 

on a selective medium containing the molecule to be tested and allowing
the growth of said recombinant host cell when the toxic gene is not 
activated; and 

b) selecting the molecule which inhibits the growth of the recombinant 
host cell defined in step a). 

32. A kit for the screening of a molecule which inhibits the protein-protein 
interaction of a set of two polypeptides according to claim 27 or 28, 
wherein said kit comprises a recombinant cell host containing a reporter 
gene the expression of which is toxic for said recombinant cell host, said 
cell host being transformed with two vectors wherein :

i) the first vector contains a nucleic acid comprising a polynucleotide 
encoding a first hybrid polypeptide containing one of said two 
polypeptides and a DNA binding domain; 

ii) the second vector contains a nucleic acid comprising a polynucleotide 
encoding a second hybrid polypeptide containing the second of said two
polypeptides and an activating domain capable of activating said toxic
reporter gene when the first and the second hybrid polypeptides are 
interacting.

33. A kit for the screening of a molecule which inhibits the protein-protein
interaction of a set of two polypeptides according to claim 27 or 28,
wherein said kit comprises a recombinant host cell containing a reporter
gene the expression of which is toxic for said recombinant host cell, said 
host cell being transformed with two plasmids wherein : 

i) the first vector contains a nucleic acid comprising a polynucleotide 
encoding a first hybrid polypeptide containing one of said two
polypeptides and the first domain of an enzyme;

ii) the second plasmid contains a nucleic acid comprising a 
polynucleotide encoding a second hybrid polypeptide containing the 
second of said two polypeptides and the second part of said enzyme
capable of activating said toxic reporter gene when the first and the
second hybrid polypeptides are interacting, said interaction recovering 
the catalytic activity of the enzyme. 

34. A marker compound wherein said compound comprises : 

a) a Selected Interacting Domain (SID®) polypeptide according to any
one of claims 7 to 10 or a variant thereof; and 
b) a detectable molecule bound thereto. 

35. The marker compound of claim 34, wherein the detectable molecule 
consists of a fluorescent protein.

36. The marker compound of claim 35, wherein the detectable protein is 
selected from the group consisting of GFP and YFP. 

37. The marker compound of claim 35, wherein the detectable molecule
is endowed with a catalytic activity. 

38. The marker compound of claim 37, wherein the detectable molecule 
is selected from the group consisting of a hydrolase, a transferase, a 
lyase, an isomerase, a ligase, a synthetase and an oxidoreductase.



39. The marker compound of claim 34, wherein the detectable molecule 
is radioactive. 

40. The marker compound of claim 34, wherein the detectable molecule
is chemiluminescent. 

41. The marker compound according to any one of claims 34 to 40, 
wherein the detectable molecule is covalently bound to the Selected 
Interacting Domain (SID®) polypeptide or a variant thereof. 

42. The marker compound according to any one of claims 34 to 40, 
wherein the detectable molecule is non covalently bound to the Selected 
Interacting Domain (SID®) polypeptide or a variant thereof. 

43. The marker compound of claim 42, wherein the detectable molecule 
is an antibody directed specifically against the Selected Interacting 
Domain (SID®) polypeptide. 

44. The marker compound of claim 43, wherein said antibody is labelled
radioactively or non radioactively. 

45. The marker compound according to claim 34, wherein : 

a) the Selected Interacting Domain (SID®) polypeptide or a variant 
thereof is covalently bound to a first ligand; and

b) the detectable molecule comprises a second ligand which binds 
specifically to the first ligand. 

46. The marker compound according to claim 45, wherein the first ligand 
is biotin and the second ligand is streptavidin. 

47. A nucleic acid encoding a marker compound according to any one of
claims 34 to 41. 

48. A nucleic acid encoding the Selected Interacting Domain (SID®) 
polypeptide or a variant thereof onto which is covalently bound a first 
ligand defined in claims 45 and 46. 

49. A recombinant vector comprising inserted therein a nucleic acid
according to any one of claims 47 and 48. 

50. The recombinant vector according to claim 48, which is selected from 
the group consisting of pACTIIst, pASAA, pT25, pKT25, pUT18, 
pUT18C, pP6 and pB5.

51. A recombinant host cell which has been transfected with a nucleic 
acid according to any one of claims 47 and 48 or a recombinant vector 
according to any one of claims 43 and 44. 

52. The recombinant host cell according to claim 51 which is of 
prokaryotic origin. 

53. The recombinant host cell according to claim 51 which is of 
eukaryotic origin. 

54. The recombinant host cell according to claim 52 which is a 
mammalian host cell. 

55. A method for detecting a polypeptide of interest within a sample,
which comprises the steps of : 

a) contacting a marker compound or a plurality of marker compounds 
according to any one of claims 34 to 46 with the sample; 

b) detecting the complexes formed between said marker compound or 
said plurality of marker compounds and said polypeptide of interest. 

56. A kit for detecting a polypeptide of interest within a sample, which 
comprises a marker compound according to any one of claims 34 to 46. 

57. A method for detecting a polypeptide of interest within a prokaryotic 
or an eukaryotic host cell, said method comprising the steps of : 

a) providing a cell host to be assayed; 

b) transfecting said host cell with a nucleic acid according to any one
of claims 41 and 42 or with a recombinant vector according to any one
of claims 49 and 50;

c) detecting the complexes formed between the marker compound
expressed by the transfected cell host and the polypeptide of interest.

58. A kit for detecting a polypeptide of interest within a prokaryotic or an 
eukaryotic host cell which comprises a nucleic acid according to any one 
of claims 47 and 48 or a recombinant vector according to any one of
claims 49 and 50.

59. A method for detecting a polypeptide of interest within a prokaryotic 
or eukaryotic host cell, said method comprising the steps of : 

a) providing a cell host to be assayed; 

b) introducing a marker compound according to any one of claims 36
to 48 within said cell host; 

c) detecting the complexes formed between the marker compound 
and the polypeptide of interest within the cell. 

60. A kit for detecting a polypeptide of interest within a prokaryotic or
eukaryotic host cell comprising a marker compound according to any one 
of claims 34 to 46. 

61. A method for detecting a polypeptide or a plurality of polypeptides of 
interest within a sample, wherein said method comprises the steps of :

a) providing a substrate onto which a Selected Interacting Domain 
(SID®) polypeptide according to any one of claims 7 to 10 or a variant 
thereof, or a plurality of Selected Interacting Domain (SID®) 
polypeptides according to any one of claims 7 to 10 or variants thereof
is (are) immobilised;

b) bringing into contact the substrate defined in a) with the sample to 
be assayed; 

c) detecting the complexes formed between the Selected Interacting
Domain (SID®) polypeptide or a variant thereof, or the plurality of
Selected Interacting Domain (SID®) polypeptides or variants thereof, and a
molecule or a plurality of molecules initially contained in the sample.

62. The method of claim 61, wherein a plurality of Selected Interacting 
Domain (SID®) polypeptides or variants thereof are immobilised on the 
substrate in an ordered manner. 

63. The method of claim 61, wherein the Selected Interacting Domain 
(SID®) polypeptide or a variant thereof, or the plurality of Selected 
Interacting Domain (SID®) polypeptides or variants thereof are 
covalently bound to the substrate. 

64. The method of claim 61, wherein the Selected Interacting Domain 
(SID®) polypeptide or a variant thereof, or the plurality of Selected 
Interacting Domain (SID®) polypeptides or variants thereof are
non-covalently bound to the substrate.

65. The method of claim 61, wherein the Selected Interacting Domain 
(SID®) polypeptide or a variant thereof, or the plurality of Selected 
Interacting Domain (SID®) polypeptides or variants thereof are 
covalently bound to a first ligand and wherein the substrate is coated 
with a second ligand which specifically binds to the first ligand. 

66. The method of claim 61, wherein the first ligand is biotin and the 
second ligand is streptavidin. 

67. The method according to any one of claims 61 to 66, wherein the 
Selected Interacting Domain (SID®) polypeptide or a variant thereof, or 
the plurality of Selected Interacting Domain (SID®) polypeptides or 
variants thereof are covalently linked to a spacer and wherein said 
spacer is covalently bound to the substrate in order to immobilise the 
Selected Interacting Domain (SID®) polypeptide, or a variant thereof or 
the plurality of Selected Interacting Domain (SID®) polypeptides. 

68. The method according to any one of claims 61 to 66 wherein the
detection step c) consists of detecting changes in optical characteristics 
of the substrate. 

69. A device for the detection of a polypeptide or a plurality of
polypeptides of interest within a sample, wherein said device comprises 
a substrate onto which a Selected Interacting Domain (SID®) 
polypeptide according to any one of claims 7 to 10 or a variant thereof or 
a plurality of Selected Interacting Domain (SID®) polypeptides according
to any one of claims 7 to 10 or variants thereof is (are) immobilised.

70. A pharmaceutical composition comprising a pharmaceutically
effective amount of a nucleic acid comprising a polynucleotide encoding
a Selected Interacting Domain (SID®) polypeptide according to any one
of claims 7 to 10 or a variant thereof.

71. A method for preventing or curing a viral infection by a Hepatitis C
virus in a human or an animal, wherein said method comprises a step of
administering to the human or animal body a pharmaceutically effective
amount of a Selected Interacting Domain (SID®) polypeptide according
to any one of claims 7 to 10.

72. A method for preventing or curing a viral infection by a Hepatitis C 
virus in a human or an animal, wherein said method comprises a step of
administering to the human or animal body a pharmaceutically effective
amount of a nucleic acid comprising a polynucleotide encoding a
Selected Interacting Domain (SID®) polypeptide according to any one of
claims 7 to 10, and wherein said polynucleotide is placed under the
control of a regulatory sequence which is functional in said human or said
animal.

73. A method for preventing or curing a viral or a bacterial infection in a 
human or an animal, wherein said method comprises a step of
administering to the human or animal body a pharmaceutically effective
amount of a recombinant expression vector comprising a polynucleotide
encoding a Selected Interacting Domain (SID®) polypeptide according to
any one of claims 7 to 10. 

74. A method for selecting a SID® polypeptide comprising the steps of : 

1) selecting a collection of nucleic acids (prey nucleic acids)
which bind specifically to a given bait polypeptide of interest; and

2) determining the nucleic acid sequences which encode
SID® polypeptides after having generated sets of polynucleotides from
the collection of nucleic acids selected at step 1).

75. The method of claim 74, wherein step 1) consists of a yeast two- 
hybrid method or a bacterial two-hybrid method. 

76. The method of claim 74, wherein step 2) comprises the following
steps of : 

a) selecting from the collection of prey polynucleotides obtained 
at the end of step 1) all prey polynucleotides encoding a prey polypeptide 
capable of interacting with said bait polypeptide and containing a
common nucleic acid fragment;

b) aligning the nucleotide sequences of the prey 
polynucleotides selected at step a) and gathering in one set or in a 
plurality of sets of sequences those nucleotide sequences which have 
sequences that overlap for more than 30% of their respective nucleic
acid length, wherein each common overlapping nucleotide sequence in
one set of sequences defines a sequence encoding a pre-SID® 
polypeptide ; and 

c) aligning two sequences encoding two respective pre-SID® 
polypeptides, and: 

(i) defining an overlapping nucleic acid sequence between the
sequences encoding the two respective pre-SID® polypeptides as a
sequence encoding a SID® polypeptide, provided that the overlapping
sequence is of at least 30 nucleotides in length;

(ii) defining a non-overlapping nucleic acid sequence between
the sequences encoding the two respective pre-SID® polypeptides as a
sequence encoding a SID® polypeptide, provided that (1) said
non-overlapping sequence is more than 30 nucleotides in length and (2)
said non-overlapping sequence represents at least 30% in length of any
one of the polynucleotides contained in the set of prey polynucleotides
used for defining the sequence encoding each pre-SID® polypeptide.
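
The interval logic of steps b) and c) of claim 76 can be sketched in code. The following Python sketch is illustrative only and forms no part of the claimed method. It assumes that each prey polynucleotide has already been aligned and reduced to a half-open coordinate interval (start, end) on a common reference sequence, and it reads "any one of the polynucleotides" in condition (2) as "at least one of them". All function names are hypothetical.

```python
from typing import List, Optional, Tuple

# Assumption: a fragment is a half-open interval [start, end), in nucleotides,
# on a shared reference sequence. The claim itself works on aligned sequences.
Interval = Tuple[int, int]


def length(iv: Interval) -> int:
    return iv[1] - iv[0]


def intersect(a: Interval, b: Interval) -> Optional[Interval]:
    """Return the region shared by a and b, or None if they do not overlap."""
    start, end = max(a[0], b[0]), min(a[1], b[1])
    return (start, end) if start < end else None


def overlaps_enough(a: Interval, b: Interval, min_fraction: float = 0.30) -> bool:
    """Step b): the overlap exceeds 30% of the length of each fragment."""
    common = intersect(a, b)
    if common is None:
        return False
    return all(length(common) > min_fraction * length(f) for f in (a, b))


def pre_sid(fragments: List[Interval]) -> Optional[Interval]:
    """Step b): the region common to every fragment of one set."""
    common: Optional[Interval] = fragments[0]
    for frag in fragments[1:]:
        if common is None:
            return None
        common = intersect(common, frag)
    return common


def outside(iv: Interval, cut: Optional[Interval]) -> List[Interval]:
    """Parts of iv lying outside cut (all of iv when cut is None)."""
    if cut is None:
        return [iv]
    pieces = []
    if iv[0] < cut[0]:
        pieces.append((iv[0], cut[0]))
    if cut[1] < iv[1]:
        pieces.append((cut[1], iv[1]))
    return pieces


def define_sids(pre_a, frags_a, pre_b, frags_b, min_len=30, min_fraction=0.30):
    """Step c): derive SID regions from two aligned pre-SID regions."""
    sids = []
    common = intersect(pre_a, pre_b)
    # c)(i): the overlap qualifies if it is at least 30 nucleotides long.
    if common is not None and length(common) >= min_len:
        sids.append(common)
    # c)(ii): a non-overlapping part qualifies if it is longer than 30 nt and
    # covers at least 30% of one prey fragment of its set (assumed reading).
    for pre, frags in ((pre_a, frags_a), (pre_b, frags_b)):
        for piece in outside(pre, common):
            long_enough = length(piece) > min_len
            big_share = any(length(piece) >= min_fraction * length(f) for f in frags)
            if long_enough and big_share:
                sids.append(piece)
    return sids


# Hypothetical coordinates, chosen only to exercise both branches of step c).
frags_a = [(80, 220), (100, 200)]
frags_b = [(150, 300), (140, 320)]
sids = define_sids(pre_sid(frags_a), frags_a, pre_sid(frags_b), frags_b)
print(sids)
```

Working on intervals rather than on raw sequences keeps the thresholds of the claim visible; a real implementation would first need an alignment step to obtain those coordinates.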

77. The method of claim 76 wherein step 2) further comprises the steps 
of: 

d) counting the number of overlapping prey polynucleotides
contained in a first set of polynucleotides defining a sequence encoding
a first SID® polypeptide;

e) counting the number of overlapping prey polynucleotides 
contained in a second set of polynucleotides defining a sequence 
encoding a second SID® polypeptide which overlaps with the sequence
encoding the first SID® polypeptide;

f) determining which sequence among those encoding 
respectively the first SID® polypeptide and the second SID® polypeptide 
has been defined with the largest number of prey polynucleotides and 
selecting this set of prey sequences; 

g) adding to the set of prey sequences selected at step f) those
sequences that were contained in the set of prey sequences used for
defining the sequence encoding the SID® polypeptide with the smallest
number of prey sequences and which overlap with the sequence
encoding the SID® polypeptide with the largest number of prey
sequences;

h) aligning the prey sequences added at step g) with the 
sequences already contained in the set of prey sequences which defined 
the sequence encoding the SID® polypeptide with the largest number of 
prey sequences; 

i) defining an overlapping sequence between the whole
sequences which were aligned in step h), wherein said overlapping
sequence consists of a sequence encoding a SID® polypeptide. 
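
Steps d) to i) of claim 77 amount to merging two overlapping sets of prey fragments. The Python sketch below is a hedged illustration, not the claimed procedure itself. It again assumes that fragments are half-open coordinate intervals on a common reference, and that a tie in fragment count is resolved in favour of the first set, a case the claim does not address. The names `common_region` and `merge_sid_sets` are hypothetical.

```python
from typing import List, Optional, Tuple

# Assumption: each prey fragment is a half-open interval [start, end).
Interval = Tuple[int, int]


def intersect(a: Interval, b: Interval) -> Optional[Interval]:
    """Return the region shared by a and b, or None if they do not overlap."""
    start, end = max(a[0], b[0]), min(a[1], b[1])
    return (start, end) if start < end else None


def common_region(fragments: List[Interval]) -> Optional[Interval]:
    """Region shared by all fragments, i.e. the sequence they jointly define."""
    common: Optional[Interval] = fragments[0]
    for frag in fragments[1:]:
        if common is None:
            return None
        common = intersect(common, frag)
    return common


def merge_sid_sets(set_a: List[Interval], set_b: List[Interval]) -> Optional[Interval]:
    # d), e), f): count the fragments and keep the set with the larger count
    # (ties go to the first set: an assumption, not stated in the claim).
    larger, smaller = (set_a, set_b) if len(set_a) >= len(set_b) else (set_b, set_a)
    sid_larger = common_region(larger)
    if sid_larger is None:
        return None
    # g): add the fragments of the smaller set that overlap the larger set's SID.
    added = [f for f in smaller if intersect(f, sid_larger) is not None]
    # h), i): align everything and take the overlap shared by all fragments.
    return common_region(larger + added)


# Hypothetical coordinates for two sets whose SID regions overlap.
set_a = [(100, 260), (120, 250), (110, 240)]
set_b = [(200, 330), (210, 320)]
merged = merge_sid_sets(set_a, set_b)
print(merged)
```

Because the larger set is selected by count rather than by argument order, swapping the two sets in this example gives the same merged region.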

78. The method according to any one of claims 74 to 76, wherein the
collection of prey nucleic acids is prepared starting from the genomic 
DNA of an organism containing contiguous Open Reading Frames. 

79. The method according to claim 78, wherein said organism is a virus. 

80. The method according to claim 79, wherein the virus consists of the 
Hepatitis C virus. 

81. The method according to claim 80, wherein the Hepatitis C virus is 
pathogenic for a mammal, including human. 

82. A SID® nucleic acid selected according to the method of any one of 
claims 74 to 81. 

83. A SID® polypeptide encoded by a nucleic acid according to claim 82. 

ABSTRACT



SID® NUCLEIC ACIDS AND POLYPEPTIDES 
SELECTED FROM A PATHOGENIC STRAIN OF 
HEPATITIS C VIRUS AND APPLICATIONS 
THEREOF 

The present invention relates to nucleic acids encoding SID® 
polypeptides which bind selectively to a polypeptide encoded by a 
pathogenic strain of the hepatitis C virus, as well as to the SID® 
polypeptides which are encoded by said nucleic acids. 

The invention also concerns vectors comprising a nucleic acid 
encoding a SID® polypeptide as well as host cells transformed with such 
vectors. 

The invention is also directed to two-hybrid methods which 
make use of the nucleic acids encoding a SID® polypeptide selected 
from a pathogenic strain of the hepatitis C virus as well as to methods for 
selecting molecules which inhibit the binding between a SID® 
polypeptide and a polypeptide which specifically binds thereto. 

SEQUENCE LISTING

<110> HYBRIGENICS S.A. 

<120> SID nucleic acids and polypeptides selected from a 
pathogenic strain of the hepatitis C virus and 
applications 

<130> Hybrigenics - SID HCV 

<140> 
<141> 

<160> 156 

<170> PatentIn Ver. 2.1

<210> 1 
<211> 50 
<212> PRT 

<213> Hepatitis C virus 
<400> 1 

Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Ala Thr Arg Lys
1 5 10 15

Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys
20 25 30

Ala Arg Arg Pro Glu Gly Arg Thr Trp Ala Gln Pro Gly Tyr Pro Trp
35 40 45

Pro Leu
50



<210> 2
<211> 35 
<212> PRT 

<213> Hepatitis C virus 
<400> 2 

Gly Arg Gly Lys Pro Gly Ile Tyr Arg Phe Val Ala Pro Gly Glu Arg
1 5 10 15

Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu Cys Tyr Asp Ala
20 25 30

Gly Cys Ala
35



<210> 3
<211> 77 
<212> PRT 

<213> Hepatitis C virus 
<400> 3 

Asn Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln
1 5 10 15

Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly
20 25 30

Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg
35 40 45

Arg Gln Pro Ile Pro Lys Ala Arg Arg Pro Glu Gly Arg Thr Trp Ala
50 55 60

Gln Pro Gly Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly
65 70 75



<210> 4 
<211> 37 
<212> PRT 

<213> Hepatitis C virus 
<400> 4 

Pro Ser Pro Val Val Val Gly Thr Thr Asp Arg Ser Gly Ala Pro Thr 
1 5 10 15

Tyr Ser Trp Gly Ala Asn Asp Thr Asp Val Phe Val Leu Asn Asn Thr 
20 25 30 

Arg Pro Pro Leu Gly 
35 



<210> 5 
<211> 150 
<212> PRT 

<213> Hepatitis C virus 
<400> 5 

Ser Arg Thr Gln Arg Arg Gly Arg Thr Gly Arg Gly Lys Pro Gly Ile
1 5 10 15

Tyr Arg Phe Val Ala Pro Gly Glu Arg Pro Ser Gly Met Phe Asp Ser
20 25 30

Ser Val Leu Cys Glu Cys Tyr Asp Ala Gly Cys Ala Trp Tyr Glu Leu
35 40 45

Thr Pro Ala Glu Thr Thr Val Arg Leu Arg Ala Tyr Met Asn Thr Pro
50 55 60

Gly Leu Pro Val Cys Gln Asp His Leu Glu Phe Trp Glu Gly Val Phe
65 70 75 80

Thr Gly Leu Thr His Ile Asp Ala His Phe Leu Ser Gln Thr Lys Gln
85 90 95

Ser Gly Glu Asn Phe Pro Tyr Leu Val Ala Tyr Gln Ala Thr Val Cys
100 105 110

Ala Arg Ala Gln Ala Pro Pro Pro Ser Trp Asp Gln Met Trp Lys Cys
115 120 125

Leu Ile Arg Leu Lys Pro Thr Leu His Gly Pro Thr Pro Leu Leu Tyr
130 135 140

Arg Leu Gly Ala Val Gln
145 150



<210> 6 
<211> 28 
<212> PRT 

<213> Hepatitis C virus 
<400> 6 

Pro Ser Pro Val Val Val Gly Thr Thr Asp Arg Ser Gly Ala Pro Thr 
1 5 10 15

Tyr Ser Trp Gly Ala Asn Asp Thr Asp Val Phe Val 
20 25 



<210> 7 
<211> 26 
<212> PRT 

<213> Hepatitis C virus 
<400> 7 

Pro Pro Arg Pro Cys Gly Ile Val Pro Ala Lys Ser Val Cys Gly Pro
1 5 10 15

Val Tyr Cys Phe Thr Pro Ser Pro Val Val 
20 25 



<210> 8 
<211> 54 






<212> PRT 

<213> Hepatitis C virus 
<400> 8 

Cys Val Val Ile Val Gly Arg Ile Val Leu Ser Gly Lys Pro Ala Ile
1 5 10 15

Ile Pro Asp Arg Glu Val Leu Tyr Gln Glu Phe Asp Glu Met Glu Glu
20 25 30

Cys Ser Gln His Leu Pro Tyr Ile Glu Gln Gly Met Met Leu Ala Glu
35 40 45

Gln Phe Lys Gln Lys Ala
50 



<210> 9 
<211> 40 
<212> PRT 

<213> Hepatitis C virus 
<400> 9 

Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Thr Cys Val Thr Gln Thr
1 5 10 15

Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr Thr Thr Leu
20 25 30

Pro Gln Asp Ala Val Ser Arg Thr
35 40 



<210> 10 
<211> 28 
<212> PRT 

<213> Hepatitis C virus 
<400> 10 

Glu Arg Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu Cys Tyr 
1 5 10 15 

Asp Ala Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala 
20 25 



<210> 11 
<211> 27 
<212> PRT 

<213> Hepatitis C virus 
<400> 11

Arg Gly Lys Pro Gly Ile Tyr Arg Phe Val Ala Pro Gly Glu Arg Pro
1 5 10 15

Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu 
20 25 



<210> 12 
<211> 42 
<212> PRT 

<213> Hepatitis C virus 
<400> 12 

Leu Glu Asp Ser Val Thr Pro Ile Asp Thr Thr Ile Met Ala Lys Asn
1 5 10 15

Glu Val Phe Cys Val Gln Pro Glu Lys Gly Gly Arg Lys Pro Ala Arg
20 25 30

Leu Ile Val Phe Pro Asp Leu Gly Val Arg
35 40 



<210> 13 
<211> 28 
<212> PRT 

<213> Hepatitis C virus 
<400> 13 

Pro Thr Gly Ser Gly Lys Ser Thr Lys Val Pro Ala Ala Tyr Ala Ala 
1 5 10 15

Gln Gly Tyr Lys Val Leu Val Leu Asn Pro Ser Val
20 25 



<210> 14 
<211> 33 
<212> PRT 

<213> Hepatitis C virus 
<400> 14 

Glu Arg Pro Tyr Cys Trp His Tyr Pro Pro Arg Pro Cys Gly Ile Val
1 5 10 15

Pro Ala Lys Ser Val Cys Gly Pro Val Tyr Cys Phe Thr Pro Ser Pro 
20 25 30 

Val 



<210> 15 
<211> 31 
<212> PRT 

<213> Hepatitis C virus 
<400> 15 

Pro Ser Pro Val Val Val Gly Thr Thr Asp Arg Ser Gly Ala Pro Thr 
1 5 10 15

Tyr Ser Trp Gly Ala Asn Asp Thr Asp Val Phe Val Leu Asn Asn 
20 25 30 



<210> 16 
<211> 77 
<212> PRT 

<213> Hepatitis C virus 
<400> 16 

Ala Gly Ala Leu Val Ala Phe Lys Ile Met Ser Gly Glu Val Pro Ser
1 5 10 15

Thr Glu Asp Leu Val Asn Leu Leu Pro Ala Ile Leu Ser Pro Gly Ala
20 25 30

Leu Val Val Gly Val Val Cys Ala Ala Ile Leu Arg Arg His Val Gly
35 40 45

Pro Gly Glu Gly Ala Val Gln Trp Met Asn Arg Leu Ile Ala Phe Ala
50 55 60

Ser Arg Gly Asn His Val Ser Pro Thr His Tyr Val Pro
65 70 75



<210> 17 
<211> 147 
<212> PRT 

<213> Hepatitis C virus 
<400> 17 

Glu Val Gln Ile Val Ser Thr Ala Thr Gln Thr Phe Leu Ala Thr Cys
1 5 10 15

Ile Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Thr Arg Thr
20 25 30

Ile Ala Ser Pro Lys Gly Pro Val Ile Gln Met Tyr Thr Asn Val Asp
35 40 45

Gln Asp Leu Val Gly Trp Pro Ala Pro Gln Gly Ser Arg Ser Leu Thr
50 55 60

Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg His Ala
65 70 75 80

Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser Leu Leu
85 90 95

Ser Pro Arg Pro Ile Ser Tyr Leu Lys Gly Ser Ser Gly Gly Pro Leu
100 105 110

Leu Cys Pro Ala Gly His Ala Val Gly Leu Phe Arg Ala Ala Val Cys
115 120 125

Thr Arg Gly Val Ala Lys Ala Val Asp Phe Ile Pro Val Glu Asn Leu
130 135 140

Gly Thr Thr
145



<210> 18 
<211> 36 
<212> PRT 

<213> Hepatitis C virus 
<400> 18 

Val Thr Gln Leu Leu Arg Arg Leu His Gln Trp Ile Ser Ser Glu Cys
1 5 10 15

Thr Thr Pro Cys Ser Gly Ser Trp Leu Arg Asp Ile Trp Asp Trp Ile
20 25 30 

Cys Glu Val Leu 
35 



<210> 19 
<211> 28 
<212> PRT 

<213> Hepatitis C virus 
<400> 19 

Val Cys Gly Pro Val Tyr Cys Phe Thr Pro Ser Pro Val Val Val Gly 
1 5 10 15

Thr Thr Asp Arg Ser Gly Ala Pro Thr Tyr Ser Trp 
20 25 



<210> 20
<211> 45
<212> PRT

<213> Hepatitis C virus
<400> 20

Pro Pro Leu Arg Ala Trp Arg His Arg Ala Arg Ser Val Arg Ala Arg
1 5 10 15

Leu Leu Ser Arg Gly Gly Arg Ala Ala Ile Cys Gly Lys Tyr Leu Phe
20 25 30

Asn Trp Ala Val Arg Thr Lys Leu Lys Leu Thr Pro Ile
35 40 45 



<210> 21 
<211> 86 
<212> PRT 

<213> Hepatitis C virus 
<400> 21 

Thr Ala Phe Val Gly Ala Gly Leu Ala Gly Ala Ala Ile Gly Ser Val
1 5 10 15

Gly Leu Gly Lys Val Leu Val Asp Ile Leu Ala Gly Tyr Gly Ala Gly
20 25 30

Val Ala Gly Ala Leu Val Ala Phe Lys Ile Met Ser Gly Glu Val Pro
35 40 45

Ser Thr Glu Asp Leu Val Asn Leu Leu Pro Ala Ile Leu Ser Pro Gly
50 55 60

Ala Leu Val Val Gly Val Val Cys Ala Ala Ile Leu Arg Arg His Val
65 70 75 80

Gly Pro Gly Glu Gly Ala
85



<210> 22 
<211> 43 
<212> PRT 

<213> Hepatitis C virus 



<400> 22 

Gly Ile Val Pro Ala Lys Ser Val Cys Gly Pro Val Tyr Cys Phe Thr
1 5 10 15

Pro Ser Pro Val Val Val Gly Thr Thr Asp Arg Ser Gly Ala Pro Thr
20 25 30

Tyr Ser Trp Gly Ala Asn Asp Thr Asp Val Phe
35 40



<210> 23 
<211> 63 
<212> PRT 

<213> Hepatitis C virus 
<400> 23 

Val Leu Val Asp Ile Leu Ala Gly Tyr Gly Ala Gly Val Ala Gly Ala
1 5 10 15

Leu Val Ala Phe Lys Ile Met Ser Gly Glu Val Pro Ser Thr Glu Asp
20 25 30

Leu Val Asn Leu Leu Pro Ala Ile Leu Ser Pro Gly Ala Leu Val Val
35 40 45

Gly Val Val Cys Ala Ala Ile Leu Arg Arg His Val Gly Pro Gly
50 55 60 



<210> 24 
<211> 29 
<212> PRT 

<213> Hepatitis C virus 
<400> 24 

Glu Arg Pro Tyr Cys Trp His Tyr Pro Pro Arg Pro Cys Gly Ile Val
1 5 10 15

Pro Ala Lys Ser Val Cys Gly Pro Val Tyr Cys Phe Thr 
20 25 



<210> 25 
<211> 76 
<212> PRT 

<213> Hepatitis C virus 
<400> 25 

Arg Arg His Trp Thr Thr Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly
1 5 10 15

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp Ser
20 25 30

Pro Thr Ala Ala Leu Val Val Ala Gln Leu Leu Arg Ile Pro Gln Ala
35 40 45

Ile Met Asp Met Ile Ala Gly Ala His Trp Gly Val Leu Ala Gly Ile
50 55 60

Ala Tyr Phe Ser Met Val Gly Asn Trp Ala Lys Val
65 70 75



<210> 26 
<211> 37 
<212> PRT 

<213> Hepatitis C virus 



<400> 26 

Ala Ile Leu Ser Ser Leu Thr Val Thr Gln Leu Leu Arg Arg Leu His
1 5 10 15

Gln Trp Ile Ser Ser Glu Cys Thr Thr Pro Cys Ser Gly Ser Trp Leu
20 25 30

Arg Asp Ile Trp Asp
35 



<210> 27 
<211> 47 
<212> PRT 

<213> Hepatitis C virus 
<400> 27 

Val Ser Arg Thr Gln Arg Arg Gly Arg Thr Gly Arg Gly Lys Pro Gly
1 5 10 15

Ile Tyr Arg Phe Val Ala Pro Gly Glu Arg Pro Ser Gly Met Phe Asp
20 25 30

Ser Ser Val Leu Cys Glu Cys Tyr Asp Ala Gly Cys Ala Trp Tyr
35 40 45



<210> 28 
<211> 53 
<212> PRT 

<213> Hepatitis C virus 



<400> 28 

Leu Pro Ala Pro Asn Tyr Lys Phe Ala Leu Trp Arg Val Ser Ala Glu
1 5 10 15

Glu Tyr Val Glu Ile Arg Arg Val Gly Asp Phe His Tyr Val Ser Gly
20 25 30

Met Thr Thr Asp Asn Leu Lys Cys Pro Cys Gln Ile Pro Ser Pro Glu
35 40 45

Phe Phe Thr Glu Leu
50 



<210> 29 
<211> 112 
<212> PRT 

<213> Hepatitis C virus 
<400> 29 

Gly Glu Ile Pro Phe Tyr Gly Lys Ala Ile Pro Leu Glu Val Ile Lys
1 5 10 15

Gly Gly Arg His Leu Ile Phe Cys His Ser Lys Lys Lys Cys Asp Glu
20 25 30

Leu Ala Ala Lys Leu Val Ala Leu Gly Ile Asn Ala Val Ala Tyr Tyr
35 40 45

Arg Gly Leu Asp Val Ser Val Ile Pro Thr Ser Gly Asp Val Val Val
50 55 60

Val Ser Thr Asp Ala Leu Met Thr Gly Phe Thr Gly Asp Phe Asp Ser
65 70 75 80

Val Ile Asp Cys Asn Thr Cys Val Thr Gln Thr Val Asp Phe Ser Leu
85 90 95

Asp Pro Thr Phe Thr Ile Glu Thr Thr Thr Leu Pro Gln Asp Ala Val
100 105 110 



<210> 30 
<211> 54 
<212> PRT 

<213> Hepatitis C virus 
<400> 30 

Val Cys Ala Ala Ile Leu Arg Arg His Val Gly Pro Gly Glu Gly Ala
1 5 10 15

Val Gln Trp Met Asn Arg Leu Ile Ala Phe Ala Ser Arg Gly Asn His
20 25 30

Val Ser Pro Thr His Tyr Val Pro Glu Ser Asp Ala Ala Ala Arg Val
35 40 45

Thr Ala Ile Leu Ser Ser
50 



<210> 31 
<211> 102 
<212> PRT 

<213> Hepatitis C virus 
<400> 31 

Ala Ile Lys Ser Leu Thr Glu Arg Leu Tyr Val Gly Gly Pro Leu Thr
1 5 10 15

Asn Ser Arg Gly Glu Asn Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly
20 25 30

Val Leu Thr Thr Ser Cys Gly Asn Thr Leu Thr Cys Tyr Ile Lys Ala
35 40 45

Arg Ala Ala Cys Arg Ala Ala Gly Leu Gln Asp Cys Thr Met Leu Val
50 55 60

Cys Gly Asp Asp Leu Val Val Ile Cys Glu Ser Ala Gly Val Gln Glu
65 70 75 80

Asp Ala Ala Ser Leu Arg Ala Phe Thr Glu Ala Met Thr Arg Tyr Ser
85 90 95

Ala Pro Pro Gly Asp Pro 
100 



<210> 32 
<211> 79 
<212> PRT 

<213> Hepatitis C virus 
<400> 32 

Leu Gln Val Leu Asp Ser His Tyr Gln Asp Val Leu Lys Glu Val Lys
1 5 10 15

Ala Ala Ala Ser Lys Val Lys Ala Asn Leu Leu Ser Val Glu Glu Ala
20 25 30

Cys Ser Leu Thr Pro Pro His Ser Ala Lys Ser Lys Phe Gly Tyr Gly
35 40 45

Ala Lys Asp Val Arg Cys His Ala Arg Lys Ala Val Ala His Ile Asn
50 55 60

Ser Val Trp Lys Asp Leu Leu Glu Asp Ser Val Thr Pro Ile Asp
65 70 75



<210> 33 
<211> 61 
<212> PRT 

<213> Hepatitis C virus 
<400> 33 

Thr Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr
1 5 10 15

Thr Thr Leu Pro Gln Asp Ala Val Ser Arg Thr Gln Arg Arg Gly Arg
20 25 30

Thr Gly Arg Gly Lys Pro Gly Ile Tyr Arg Phe Val Ala Pro Gly Glu
35 40 45

Arg Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu 
50 55 60 



<210> 34 
<211> 77 
<212> PRT 

<213> Hepatitis C virus 
<400> 34 

Val Leu Asp Ser His Tyr Gln Asp Val Leu Lys Glu Val Lys Ala Ala
1 5 10 15

Ala Ser Lys Val Lys Ala Asn Leu Leu Ser Val Glu Glu Ala Cys Ser
20 25 30

Leu Thr Pro Pro His Ser Ala Lys Ser Lys Phe Gly Tyr Gly Ala Lys
35 40 45

Asp Val Arg Cys His Ala Arg Lys Ala Val Ala His Ile Asn Ser Val
50 55 60

Trp Lys Asp Leu Leu Glu Asp Ser Val Thr Pro Ile Asp
65 70 75



<210> 35 
<211> 26 
<212> PRT 

<213> Hepatitis C virus
<400> 35

Tyr Pro Pro Arg Pro Cys Gly Ile Val Pro Ala Lys Ser Val Cys Gly
1 5 10 15

Pro Val Tyr Cys Phe Thr Pro Ser Pro Val 
20 25 



<210> 36 
<211> 37 
<212> PRT 

<213> Hepatitis C virus 
<400> 36 

Pro Ile Ser Tyr Ala Asn Gly Ser Gly Leu Asp Glu Arg Pro Tyr Cys
1 5 10 15

Trp His Tyr Pro Pro Arg Pro Cys Gly Ile Val Pro Ala Lys Ser Val
20 25 30

Cys Gly Pro Val Tyr
35



<210> 37 
<211> 35 
<212> PRT 

<213> Hepatitis C virus 

<400> 37 

Thr Val Thr Gln Leu Leu Arg Arg Leu His Gln Trp Ile Ser Ser Glu
1 5 10 15

Cys Thr Thr Pro Cys Ser Gly Ser Trp Leu Arg Asp Ile Trp Asp Trp
20 25 30

Ile Cys Glu
35



<210> 38 
<211> 25 
<212> PRT 

<213> Hepatitis C virus 
<400> 38 

Leu Leu Arg Arg Leu His Gln Trp Ile Ser Ser Glu Cys Thr Thr Pro
1 5 10 15

Cys Ser Gly Ser Trp Leu Arg Asp Ile
20 25



<210> 39
<211> 152 
<212> DNA 

<213> Hepatitis C virus 
<400> 39

cttgttgccg cgcaggggcc ctagattggg tgtgcgcgcg acgaggaaga cttccgagcg 60
gtcgcaacct cgaggtagac gtcagcctat ccccaaggca cgtcggcccg agggcaggac 120
ctgggctcag cccgggtacc cttggcccct ct 152



<210> 40 
<211> 106 
<212> DNA 

<213> Hepatitis C virus 
<400> 40 

tggcaggggg aagccaggca tctatagatt tgtggcaccg ggggagcgcc cctccggcat 60 
gttcgactcg tccgtcctct gtgagtgcta tgacgcgggc tgtgct 106 



<210> 41 
<211> 234 
<212> DNA 

<213> Hepatitis C virus 
<400> 41 

taacaccaac cgtcgcccac aggacgtcaa gttcccgggt ggcggtcaga tcgttggtgg 60
agtttacttg ttgccgcgca ggggccctag attgggtgtg cgcgcgacga ggaagacttc 120
cgagcggtcg caacctcgag gtagacgtca gcctatcccc aaggcacgtc ggcccgaggg 180
caggacctgg gctcagcccg ggtacccttg gcccctctat ggcaatgagg gttg 234 



<210> 42 
<211> 114 
<212> DNA 

<213> Hepatitis C virus 
<400> 42 

tcccagcccc gtggtggtgg gaacgaccga caggtcgggc gcgcctacct acagctgggg 60 
tgcaaatgat acggatgtct tcgtccttaa caacaccagg ccaccgctgg gcaa 114 



<210> 43 
<211> 453 
<212> DNA 

<213> Hepatitis C virus 
<400> 43 

ctccaggact caacgccggg gcaggactgg cagggggaag ccaggcatct atagatttgt 60 
ggcaccgggg gagcgcccct ccggcatgtt cgactcgtcc gtcctctgtg agtgctatga 120
cgcgggctgt gcttggtatg agctcacgcc cgccgagact acagttaggc tacgagcgta 180
catgaacacc ccggggcttc ccgtgtgcca ggaccatctt gaattttggg agggcgtctt 240
tacgggcctc actcatatag atgcccactt tttatcccag acaaagcaga gtggggagaa 300
ctttccttac ctggtagcgt accaagccac cgtgtgcgct agggctcaag cccctccccc 360
atcgtgggac cagatgtgga agtgtttgat ccgccttaaa cccaccctcc atgggccaac 420
acccctgcta tacagactgg gcgctgttca gaa 453 



<210> 44 
<211> 85 
<212> DNA 

<213> Hepatitis C virus 
<400> 44 

tcccagcccc gtggtggtgg gaacgaccga caggtcgggc gcgcctacct acagctgggg 60 
tgcaaatgat acggatgtct tcgtc 85



<210> 45 
<211> 80 
<212> DNA 

<213> Hepatitis C virus 
<400> 45 

ccctccaaga ccttgtggca ttgtgcccgc aaagagcgtg tgtggcccgg tatattgctt 60 
cactcccagc cccgtggtgg 80



<210> 46 
<211> 165 
<212> DNA 

<213> Hepatitis C virus 



<400> 46 

ctgcgtggtc atagtgggca ggatcgtctt gtccgggaag ccggcaatta tacctgacag 60
ggaggttctc taccaggagt tcgatgagat ggaagagtgc tctcagcact taccgtacat 120
cgagcaaggg atgatgctcg ctgagcagtt caagcagaag gccct 165



<210> 47 
<211> 123 
<212> DNA 

<213> Hepatitis C virus 



<400> 47 

cggcgacttc gactctgtga tagactgcaa cacgtgtgtc actcagacag tcgatttcag 60
ccttgaccct acctttacca ttgagacaac cacgctcccc caggatgctg tctccaggac 120
tca 123



<210> 48 
<211> 87 
<212> DNA 

<213> Hepatitis C virus 



<400> 48

ggagcgcccc tccggcatgt tcgactcgtc cgtcctctgt gagtgctatg acgcgggctg 60 
tgcttggtat gagctcacgc ccgccga 87



<210> 49 
<211> 84 
<212> DNA 

<213> Hepatitis C virus 
<400> 49 

cagggggaag ccaggcatct atagatttgt ggcaccgggg gagcgcccct ccggcatgtt 60 
cgactcgtcc gtcctctgtg agtg 84 



<210> 50

<211> 128 
<212> DNA 

<213> Hepatitis C virus 
<400> 50 

tctggaagac agtgtaacac caatagacac taccatcatg gccaagaacg aggttttctg 60 
cgttcagcct gagaaggggg gtcgtaagcc agctcgtctc atcgtgttcc ccgacctggg 120
cgtgcgcg 128



<210> 51 
<211> 85 
<212> DNA 

<213> Hepatitis C virus 
<400> 51 

tcccaccggc agcggtaaga gcaccaaggt cccggctgcg tacgcagccc agggctacaa 60 
ggtgttggtg ctcaacccct ctgtt 85 



<210> 52
<211> 102
<212> DNA

<213> Hepatitis C virus

<400> 52 

cgaacgcccc tactgctggc actaccctcc aagaccttgt ggcattgtgc ccgcaaagag 60 
cgtgtgtggc ccggtatatt gcttcactcc cagccccgtg gt 102 



<210> 53 
<211> 95 
<212> DNA 

<213> Hepatitis C virus 
<400> 53 

tcccagcccc gtggtggtgg gaacgaccga caggtcgggc gcgcctacct acagctgggg 60
tgcaaatgat acggatgtct tcgtccttaa caaca 95



<210> 54 
<211> 234 
<212> DNA 

<213> Hepatitis C virus 
<400> 54 

ggcgggagct cttgtagcat tcaagatcat gagcggtgag gtcccctcca cggaggacct 60 
ggtcaatctg ctgcccgcca tcctctcgcc tggagccctt gtagtcggtg tggtctgcgc 120
agcaatactg cgccggcacg ttggcccggg cgagggggca gtgcaatgga tgaaccggct 180 
aatagccttc gcctcccggg ggaaccatgt ttcccccacg cactacgtgc cgga 234 



<210> 55 
<211> 442 
<212> DNA 

<213> Hepatitis C virus 
<400> 55 

tgaggtccag atcgtgtcaa ctgctaccca aaccttcctg gcaacgtgca tcaatggggt 60 
atgctggact gtctaccacg gggccggaac gaggaccatc gcatcaccca agggtcctgt 120 
catccagatg tataccaatg tggaccaaga ccttgtgggc tggcccgctc ctcaaggttc 180 
ccgctcattg acaccctgta cctgcggctc ctcggacctt tacctggtca cgaggcacgc 240 
cgatgtcatt cccgtgcgcc ggcgaggtga tagcaggggt agcctgcttt cgccccggcc 300 
catttcctac ttgaaaggct cctcgggggg tccgctgttg tgccccgcgg gacacgccgt 360 
gggcctattc agggccgcgg tgtgcacccg tggagtggct aaagcggtgg actttatccc 420 
tgtggagaac ctagggacaa cc 442 



<210> 56 
<211> 111 
<212> DNA 

<213> Hepatitis C virus 
<400> 56 

tgtaacccag ctcctgaggc gactgcatca gtggataagc tcggagtgta ccactccatg 60 
ctccggttcc tggctaaggg acatctggga ctggatatgc gaggtgctga g 111



<210> 57 
<211> 87 
<212> DNA 

<213> Hepatitis C virus 
<400> 57 

cgtgtgtggc ccggtatatt gcttcactcc cagccccgtg gtggtgggaa cgaccgacag 60 
gtcgggcgcg cctacctaca gctgggg 87



<210> 58 
<211> 137 
<212> DNA

<213> Hepatitis C virus
<400> 58

cccgcccttg cgagcttgga gacaccgggc ccggagcgtc cgcgctaggc ttctgtccag 60
aggaggcagg gctgccatat gtggcaagta cctcttcaac tgggcagtaa gaacaaagct 120
caaactcact ccaatag 137



<210> 59 
<211> 259 
<212> DNA 

<213> Hepatitis C virus 
<400> 59 

tactgccttt gtgggtgctg gcctagctgg cgccgccatc ggcagcgttg gactggggaa 60
ggtcctcgtg gacattcttg cagggtatgg cgcgggcgtg gcgggagctc ttgtagcatt 120
caagatcatg agcggtgagg tcccctccac ggaggacctg gtcaatctgc tgcccgccat 180
cctctcgcct ggagcccttg tagtcggtgt ggtctgcgca gcaatactgc gccggcacgt 240
tggcccgggc gagggggca 259



<210> 60 
<211> 130 
<212> DNA 

<213> Hepatitis C virus 



<400> 60 

tggcattgtg cccgcaaaga gcgtgtgtgg cccggtatat tgcttcactc ccagccccgt 60
ggtggtggga acgaccgaca ggtcgggcgc gcctacctac agctggggtg caaatgatac 120
ggatgtcttc 130



<210> 61 
<211> 191 
<212> DNA 

<213> Hepatitis C virus 
<400> 61 

ggtcctcgtg gacattcttg cagggtatgg cgcgggcgtg gcgggagctc ttgtagcatt 60
caagatcatg agcggtgagg tcccctccac ggaggacctg gtcaatctgc tgcccgccat 120
cctctcgcct ggagcccttg tagtcggtgt ggtctgcgca gcaatactgc gccggcacgt 180
tggcccgggc g 191



<210> 62 
<211> 89 
<212> DNA 

<213> Hepatitis C virus 



<400> 62 

cgaacgcccc tactgctggc actaccctcc aagaccttgt ggcattgtgc ccgcaaagag 60 
cgtgtgtggc ccggtatatt gcttcactc 89



<210> 63
<211> 230 
<212> DNA 

<213> Hepatitis C virus 
<400> 63 

caggcgccac tggacgacgc aagactgcaa ttgttctatc tatcccggcc atataacggg 60 
tcatcgcatg gcatgggata tgatgatgaa ctggtcccct acggcagcgt tggtggtagc 120
tcagctgctc cggatcccac aagccatcat ggacatgatc gctggtgctc actggggagt 180
cctggcgggc atagcgtatt tctccatggt ggggaactgg gcgaaggtcc 230



<210> 64 
<211> 113 
<212> DNA 

<213> Hepatitis C virus 



<400> 64 

tgccatactc agcagcctca ctgtaaccca gctcctgagg cgactgcatc agtggataag 60
ctcggagtgt accactccat gctccggttc ctggctaagg gacatctggg act 113 



<210> 65 
<211> 142 
<212> DNA 

<213> Hepatitis C virus 



<400> 65 

tgtctccagg actcaacgcc ggggcaggac tggcaggggg aagccaggca tctatagatt 60
tgtggcaccg ggggagcgcc cctccggcat gttcgactcg tccgtcctct gtgagtgcta 120
tgacgcgggc tgtgcttggt at 142



<210> 66 
<211> 162 
<212> DNA 

<213> Hepatitis C virus 



<400> 66 

ccttcctgcg ccgaactata agttcgcgct gtggagggtg tctgcagagg aatacgtgga 60
gataaggcgg gtgggggact tccactacgt atcgggtatg actactgaca atcttaaatg 120
cccgtgccag atcccatcgc ccgaattttt cacagaattg ga 162



<210> 67 
<211> 337 
<212> DNA 

<213> Hepatitis C virus 
<400> 67 

cggagagatc cccttttacg gcaaggctat ccccctcgag gtgatcaagg ggggaagaca 60
tctcatcttc tgccactcaa agaagaagtg cgacgagctc gccgcgaagc tggtcgcatt 120
gggcatcaat gccgtggcct actaccgcgg tcttgacgtg tctgtcatcc cgaccagcgg 180
cgatgttgtc gtcgtgtcga ccgatgctct catgactggc tttaccggcg acttcgactc 240
tgtgatagac tgcaacacgt gtgtcactca gacagtcgat ttcagccttg accctacctt 300
taccattgag acaaccacgc tcccccagga tgctgtc 337



<210> 68 
<211> 163 
<212> DNA 

<213> Hepatitis C virus 
<400> 68 

ggtctgcgca gcaatactgc gccggcacgt tggcccgggc gagggggcag tgcaatggat 60 
gaaccggcta atagccttcg cctcccgggg gaaccatgtt tcccccacgc actacgtgcc 120
ggagagcgat gcagccgccc gcgtcactgc catactcagc agc 163



<210> 69 
<211> 309 
<212> DNA 

<213> Hepatitis C virus 
<400> 69 

ggccatcaag tccctcactg agaggcttta tgttgggggc cctcttacca attcaagggg 60 
ggaaaactgc ggctaccgca ggtgccgcgc gagcggcgta ctgacaacta gctgtggtaa 120
caccctcact tgctacatca aggcccgggc agcctgtcga gccgcagggc tccaggactg 180
caccatgctc gtgtgtggcg acgacttagt cgttatctgt gaaagtgcgg gggtccagga 240
ggacgcggcg agcctgagag ccttcacgga ggctatgacc aggtactccg ccccccccgg 300
ggacccccc 309



<210> 70 
<211> 240 
<212> DNA 

<213> Hepatitis C virus 
<400> 70 

actgcaagtt ctggacagcc attaccagga cgtgctcaag gaggtcaaag cagcggcgtc 60
aaaagtgaag gctaacttgc tatccgtaga ggaagcttgc agcctgacgc ccccacattc 120
agccaaatcc aagtttggct atggggcaaa agacgtccgt tgccatgcca gaaaggccgt 180
agcccacatc aactccgtgt ggaaagacct tctggaagac agtgtaacac caatagacac 240 



<210> 71 
<211> 184 
<212> DNA 

<213> Hepatitis C virus 
<400> 71 

cactcagaca gtcgatttca gccttgaccc tacctttacc attgagacaa ccacgctccc 60 
ccaggatgct gtctccagga ctcaacgccg gggcaggact ggcaggggga agccaggcat 120
ctatagattt gtggcaccgg gggagcgccc ctccggcatg ttcgactcgt ccgtcctctg 180 
tgag 184 



<210> 72
<211> 234
<212> DNA 

<213> Hepatitis C virus 
<400> 72 

agttctggac agccattacc aggacgtgct caaggaggtc aaagcagcgg cgtcaaaagt 60 
gaaggctaac ttgctatccg tagaggaagc ttgcagcctg acgcccccac attcagccaa 120 
atccaagttt ggctatgggg caaaagacgt ccgttgccat gccagaaagg ccgtagccca 180
catcaactcc gtgtggaaag accttctgga agacagtgta acaccaatag acac 234 



<210> 73 
<211> 80 
<212> DNA 

<213> Hepatitis C virus 
<400> 73 

ctaccctcca agaccttgtg gcattgtgcc cgcaaagagc gtgtgtggcc cggtatattg 60 
cttcactccc agccccgtgg 80

<210> 74 
<211> 112 
<212> DNA 

<213> Hepatitis C virus 
<400> 74 

tcctatcagt tatgccaacg gaagcggcct cgacgaacgc ccctactgct ggcactaccc 60 
tccaagacct tgtggcattg tgcccgcaaa gagcgtgtgt ggcccggtat at 112 

<210> 75 
<211> 107 
<212> DNA 

<213> Hepatitis C virus 
<400> 75 

cactgtaacc cagctcctga ggcgactgca tcagtggata agctcggagt gtaccactcc 60 
atgctccggt tcctggctaa gggacatctg ggactggata tgcgagg 107 

<210> 76 
<211> 78 
<212> DNA 

<213> Hepatitis C virus 
<400> 76 

gctcctgagg cgactgcatc agtggataag ctcggagtgt accactccat gctccggttc 60
ctggctaagg gacatctg 78

<210> 77 
<211> 103 
<212> PRT

<213> Hepatitis C virus
<400> 77

Ala Cys Glu Cys Pro Gly Arg Ser Arg Arg Pro Cys Thr Met Ser Thr
1 5 10 15

Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn Arg Arg Pro
20 25 30

Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly Gly Val Tyr
35 40 45

Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Ala Thr Arg Lys
50 55 60

Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys
65 70 75 80

Ala Arg Arg Pro Glu Gly Arg Thr Trp Ala Gln Pro Gly Tyr Pro Trp
85 90 95

Pro Leu Tyr Gly Asn Glu Gly 
100 



<210> 78 
<211> 113 
<212> PRT 

<213> Hepatitis C virus 
<400> 78 

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn
1 5 10 15

Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly
20 25 30

Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Ala
35 40 45

Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro
50 55 60

Ile Pro Lys Ala Arg Arg Pro Glu Gly Arg Thr Trp Ala Gln Pro Gly
65 70 75 80

Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp 
85 90 95 

Leu Leu Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Thr Asp Pro 
100 105 110 

Arg



<210> 79
<211> 114
<212> PRT

<213> Hepatitis C virus 
<400> 79 

Ala Ile Leu His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Gly Asn
1 5 10 15

Ala Ser Arg Cys Trp Val Ala Val Thr Pro Thr Val Ala Thr Arg Asp
20 25 30

Gly Lys Leu Pro Thr Thr Gln Leu Arg Arg His Ile Asp Leu Leu Val
35 40 45

Gly Ser Ala Thr Leu Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys Gly
50 55 60

Ser Val Phe Leu Val Gly Gln Leu Phe Thr Phe Ser Pro Arg Arg His
65 70 75 80

Trp Thr Thr Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly His Ile Thr
85 90 95

Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp Ser Pro Thr Ala 
100 105 110 

Ala Leu 



<210> 80 
<211> 91 
<212> PRT 

<213> Hepatitis C virus 
<400> 80 

Gly Val Asp Ala Glu Thr His Val Thr Gly Gly Asn Ala Gly Arg Thr
1 5 10 15

Thr Ala Gly Leu Val Gly Leu Leu Thr Pro Gly Ala Lys Gln Asn Ile
20 25 30

Gln Leu Ile Asn Thr Asn Gly Ser Trp His Ile Asn Ser Thr Ala Leu
35 40 45

Asn Cys Asn Glu Ser Leu Asn Thr Gly Trp Leu Ala Gly Leu Phe Tyr
50 55 60

Gln His Lys Phe Asn Ser Ser Gly Cys Pro Glu Arg Leu Ala Ser Cys
65 70 75 80

Arg Arg Leu Thr Asp Phe Ala Gln Gly Trp Gly
85 90



<210> 81 
<211> 176 
<212> PRT 

<213> Hepatitis C virus 
<400> 81 

Trp Gly Pro Ile Ser Tyr Ala Asn Gly Ser Gly Leu Asp Glu Arg Pro
1 5 10 15

Tyr Cys Trp His Tyr Pro Pro Arg Pro Cys Gly Ile Val Pro Ala Lys
20 25 30

Ser Val Cys Gly Pro Val Tyr Cys Phe Thr Pro Ser Pro Val Val Val 
35 40 45 

Gly Thr Thr Asp Arg Ser Gly Ala Pro Thr Tyr Ser Trp Gly Ala Asn 
50 55 60 

Asp Thr Asp Val Phe Val Leu Asn Asn Thr Arg Pro Pro Leu Gly Asn 
65 70 75 80 

Trp Phe Gly Cys Thr Trp Met Asn Ser Thr Gly Phe Thr Lys Val Cys 
85 90 95 

Gly Ala Pro Pro Cys Val Ile Gly Gly Val Gly Asn Asn Thr Leu Leu
100 105 110 

Cys Pro Thr Asp Cys Phe Arg Lys His Pro Glu Ala Thr Tyr Ser Arg 
115 120 125 

Cys Gly Ser Gly Pro Trp Ile Thr Pro Arg Cys Met Val Asp Tyr Pro
130 135 140

Tyr Arg Leu Trp His Tyr Pro Cys Thr Ile Asn Tyr Thr Ile Phe Lys
145 150 155 160 

Val Arg Met Tyr Val Gly Gly Val Glu His Arg Leu Glu Ala Ala Cys 
165 170 175 



<210> 82 
<211> 96
<212> PRT

<213> Hepatitis C virus
<400> 82

Trp His Tyr Pro Pro Arg Pro Cys Gly Ile Val Pro Ala Lys Ser Val
1 5 10 15

Cys Gly Pro Val Tyr Cys Phe Thr Pro Ser Pro Val Val Val Gly Thr 
20 25 30 

Thr Asp Arg Ser Gly Ala Pro Thr Tyr Ser Trp Gly Ala Asn Asp Thr 
35 40 45 

Asp Val Phe Val Leu Asn Asn Thr Arg Pro Pro Leu Gly Asn Trp Phe 
50 55 60 

Gly Cys Thr Trp Met Asn Ser Thr Gly Phe Thr Lys Val Cys Gly Ala 
65 70 75 80 

Pro Pro Cys Val Ile Gly Gly Val Gly Asn Asn Thr Leu Leu Cys Pro
85 90 95



<210> 83 
<211> 278 
<212> PRT 
<213> Hepatitis C virus

<400> 83

Ala Ala Cys Gly Asp Ile Ile Asn Gly Leu Pro Val Ser Ala Arg Arg
1 5 10 15

Gly Gln Glu Ile Leu Leu Gly Pro Ala Asp Gly Met Val Ser Lys Gly
20 25 30

Trp Arg Leu Leu Ala Pro Ile Thr Ala Tyr Ala Gln Gln Thr Arg Gly
35 40 45

Leu Leu Gly Cys Ile Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln
50 55 60

Val Glu Gly Glu Val Gln Ile Val Ser Thr Ala Thr Gln Thr Phe Leu
65 70 75 80

Ala Thr Cys Ile Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly
85 90 95

Thr Arg Thr Ile Ala Ser Pro Lys Gly Pro Val Ile Gln Met Tyr Thr
100 105 110

Asn Val Asp Gln Asp Leu Val Gly Trp Pro Ala Pro Gln Gly Ser Arg
115 120 125

Ser Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr 
130 135 140 

Arg His Ala Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly
145 150 155 160

Ser Leu Leu Ser Pro Arg Pro Ile Ser Tyr Leu Lys Gly Ser Ser Gly
165 170 175 

Gly Pro Leu Leu Cys Pro Ala Gly His Ala Val Gly Leu Phe Arg Ala 
180 185 190 

Ala Val Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe Ile Pro Val
195 200 205 

Glu Asn Leu Gly Thr Thr Met Arg Ser Pro Val Phe Thr Asp Asn Ser 
210 215 220 

Ser Pro Pro Ala Val Pro Gln Ser Phe Gln Val Ala His Leu His Ala
225 230 235 240 

Pro Thr Gly Ser Gly Lys Ser Thr Lys Val Pro Ala Ala Tyr Ala Ala 
245 250 255 

Gln Gly Tyr Lys Val Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu
260 265 270 

Gly Phe Gly Ala Tyr Met 
275 



<210> 84 
<211> 158 
<212> PRT 

<213> Hepatitis C virus 
<400> 84 

Arg Arg Arg Gly Asp Ser Arg Gly Ser Leu Leu Ser Pro Arg Pro Ile
1 5 10 15

Ser Tyr Leu Lys Gly Ser Ser Gly Gly Pro Leu Leu Cys Pro Ala Gly 
20 25 30 

His Ala Val Gly Leu Phe Arg Ala Ala Val Cys Thr Arg Gly Val Ala 
35 40 45 

Lys Ala Val Asp Phe Ile Pro Val Glu Asn Leu Gly Thr Thr Met Arg
50 55 60

Ser Pro Val Phe Thr Asp Asn Ser Ser Pro Pro Ala Val Pro Gln Ser
65 70 75 80

Phe Gln Val Ala His Leu His Ala Pro Thr Gly Ser Gly Lys Ser Thr
85 90 95

Lys Val Pro Ala Ala Tyr Ala Ala Gln Gly Tyr Lys Val Leu Val Leu
100 105 110 

Asn Pro Ser Val Ala Ala Thr Leu Gly Phe Gly Ala Tyr Met Ser Lys 
115 120 125 

Ala His Gly Val Asp Pro Asn Ile Arg Thr Gly Val Arg Thr Ile Thr
130 135 140

Thr Gly Ser Pro Ile Thr Tyr Ser Thr Tyr Gly Lys Phe Leu
145 150 155 



<210> 85 
<211> 263 
<212> PRT 

<213> Hepatitis C virus 
<400> 85 

Asp Ser Arg Gly Ser Leu Leu Ser Pro Arg Pro Ile Ser Tyr Leu Lys
1 5 10 15

Gly Ser Ser Gly Gly Pro Leu Leu Cys Pro Ala Gly His Ala Val Gly 
20 25 30 

Leu Phe Arg Ala Ala Val Cys Thr Arg Gly Val Ala Lys Ala Val Asp 
35 40 45 

Phe Ile Pro Val Glu Asn Leu Gly Thr Thr Met Arg Ser Pro Val Phe
50 55 60

Thr Asp Asn Ser Ser Pro Pro Ala Val Pro Gln Ser Phe Gln Val Ala
65 70 75 80

His Leu His Ala Pro Thr Gly Ser Gly Lys Ser Thr Lys Val Pro Ala
85 90 95

Ala Tyr Ala Ala Gln Gly Tyr Lys Val Leu Val Leu Asn Pro Ser Val
100 105 110 

Ala Ala Thr Leu Gly Phe Gly Ala Tyr Met Ser Lys Ala His Gly Val 
115 120 125 



Asp Pro Asn Ile Arg Thr Gly Val Arg Thr Ile Thr Thr Gly Ser Pro
130 135 140

Ile Thr Tyr Ser Thr Tyr Gly Lys Phe Leu Ala Asp Gly Gly Cys Ser
145 150 155 160

Gly Gly Ala Tyr Asp Ile Ile Ile Cys Asp Glu Cys His Ser Thr Asp
165 170 175

Ala Thr Ser Ile Leu Gly Ile Gly Thr Val Leu Asp Gln Ala Glu Thr
180 185 190

Ala Gly Ala Arg Leu Val Val Leu Ala Thr Ala Thr Pro Pro Gly Ser
195 200 205

Val Thr Val Ser His Pro Asn Ile Glu Glu Val Ala Leu Ser Thr Thr
210 215 220

Gly Glu Ile Pro Phe Tyr Gly Lys Ala Ile Pro Leu Glu Val Ile Lys
225 230 235 240

Gly Gly Arg His Leu Ile Phe Cys His Ser Lys Lys Lys Cys Asp Glu
245 250 255

Leu Ala Ala Lys Leu Val Ala
260



<210> 86 
<211> 194 
<212> PRT 
<213> Hepatitis C virus

<400> 86

Asp Asn Ser Ser Pro Pro Ala Val Pro Gln Ser Phe Gln Val Ala His
1 5 10 15

Leu His Ala Pro Thr Gly Ser Gly Lys Ser Thr Lys Val Pro Ala Ala
20 25 30

Tyr Ala Ala Gln Gly Tyr Lys Val Leu Val Leu Asn Pro Ser Val Ala
35 40 45

Ala Thr Leu Gly Phe Gly Ala Tyr Met Ser Lys Ala His Gly Val Asp
50 55 60

Pro Asn Ile Arg Thr Gly Val Arg Thr Ile Thr Thr Gly Ser Pro Ile
65 70 75 80

Thr Tyr Ser Thr Tyr Gly Lys Phe Leu Ala Asp Gly Gly Cys Ser Gly
85 90 95

Gly Ala Tyr Asp Ile Ile Ile Cys Asp Glu Cys His Ser Thr Asp Ala
100 105 110

Thr Ser Ile Leu Gly Ile Gly Thr Val Leu Asp Gln Ala Glu Thr Ala
115 120 125

Gly Ala Arg Leu Val Val Leu Ala Thr Ala Thr Pro Pro Gly Ser Val
130 135 140

Thr Val Ser His Pro Asn Ile Glu Glu Val Ala Leu Ser Thr Thr Gly
145 150 155 160

Glu Ile Pro Phe Tyr Gly Lys Ala Ile Pro Leu Glu Val Ile Lys Gly
165 170 175

Gly Arg His Leu Ile Phe Cys His Ser Lys Lys Lys Cys Asp Glu Leu
180 185 190

Ala Ala 



<210> 87 
<211> 205 
<212> PRT 

<213> Hepatitis C virus 
<400> 87 

Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp Ile Ile Ile Cys
1 5 10 15

Asp Glu Cys His Ser Thr Asp Ala Thr Ser Ile Leu Gly Ile Gly Thr
20 25 30

Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu Val Val Leu Ala
35 40 45

Thr Ala Thr Pro Pro Gly Ser Val Thr Val Ser His Pro Asn Ile Glu
50 55 60

Glu Val Ala Leu Ser Thr Thr Gly Glu Ile Pro Phe Tyr Gly Lys Ala
65 70 75 80

Ile Pro Leu Glu Val Ile Lys Gly Gly Arg His Leu Ile Phe Cys His
85 90 95

Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu Val Ala Leu Gly
100 105 110

Ile Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val Ser Val Ile Pro
115 120 125

Thr Ser Gly Asp Val Val Val Val Ser Thr Asp Ala Leu Met Thr Gly
130 135 140

Phe Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Thr Cys Val Thr
145 150 155 160

Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr Thr
165 170 175

Thr Leu Pro Gln Asp Ala Val Ser Arg Thr Gln Arg Arg Gly Arg Thr
180 185 190

Gly Arg Gly Lys Pro Gly Ile Tyr Arg Phe Val Ala Pro
195 200 205



<210> 88 
<211> 186 
<212> PRT 

<213> Hepatitis C virus 
<400> 88 

Ser Thr Asp Ala Thr Ser Ile Leu Gly Ile Gly Thr Val Leu Asp Gln
1 5 10 15

Ala Glu Thr Ala Gly Ala Arg Leu Val Val Leu Ala Thr Ala Thr Pro
20 25 30

Pro Gly Ser Val Thr Val Ser His Pro Asn Ile Glu Glu Val Ala Leu
35 40 45

Ser Thr Thr Gly Glu Ile Pro Phe Tyr Gly Lys Ala Ile Pro Leu Glu
50 55 60

Val Ile Lys Gly Gly Arg His Leu Ile Phe Cys His Ser Lys Lys Lys
65 70 75 80

Cys Asp Glu Leu Ala Ala Lys Leu Val Ala Leu Gly Ile Asn Ala Val
85 90 95

Ala Tyr Tyr Arg Gly Leu Asp Val Ser Val Ile Pro Thr Ser Gly Asp
100 105 110

Val Val Val Val Ser Thr Asp Ala Leu Met Thr Gly Phe Thr Gly Asp
115 120 125

Phe Asp Ser Val Ile Asp Cys Asn Thr Cys Val Thr Gln Thr Val Asp
130 135 140

Phe Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr Thr Thr Leu Pro Gln
145 150 155 160

Asp Ala Val Ser Arg Thr Gln Arg Arg Gly Arg Thr Gly Arg Gly Lys
165 170 175

Pro Gly Ile Tyr Arg Phe Val Ala Pro Gly
180 185



<210> 89
<211> 158 
<212> PRT 
<213> Hepatitis C virus

<400> 89

Val Ile Asp Cys Asn Thr Cys Val Thr Gln Thr Val Asp Phe Ser Leu
1 5 10 15

Asp Pro Thr Phe Thr Ile Glu Thr Thr Thr Leu Pro Gln Asp Ala Val
20 25 30

Ser Arg Thr Gln Arg Arg Gly Arg Thr Gly Arg Gly Lys Pro Gly Ile
35 40 45

Tyr Arg Phe Val Ala Pro Gly Glu Arg Pro Ser Gly Met Phe Asp Ser
50 55 60

Ser Val Leu Cys Glu Cys Tyr Asp Ala Gly Cys Ala Trp Tyr Glu Leu
65 70 75 80

Thr Pro Ala Glu Thr Thr Val Arg Leu Arg Ala Tyr Met Asn Thr Pro
85 90 95

Gly Leu Pro Val Cys Gln Asp His Leu Glu Phe Trp Glu Gly Val Phe
100 105 110

Thr Gly Leu Thr His Ile Asp Ala His Phe Leu Ser Gln Thr Lys Gln
115 120 125

Ser Gly Glu Asn Phe Pro Tyr Leu Val Ala Tyr Gln Ala Thr Val Cys
130 135 140

Ala Arg Ala Gln Ala Pro Pro Pro Ser Trp Asp Gln Met Trp
145 150 155



<210> 90 
<211> 129 
<212> PRT 

<213> Hepatitis C virus 
<400> 90 

Arg Phe Val Ala Pro Gly Glu Arg Pro Ser Gly Met Phe Asp Ser Ser 
1 5 10 15

Val Leu Cys Glu Cys Tyr Asp Ala Gly Cys Ala Trp Tyr Glu Leu Thr 
20 25 30 

Pro Ala Glu Thr Thr Val Arg Leu Arg Ala Tyr Met Asn Thr Pro Gly 
35 40 45 

Leu Pro Val Cys Gln Asp His Leu Glu Phe Trp Glu Gly Val Phe Thr
50 55 60

Gly Leu Thr His Ile Asp Ala His Phe Leu Ser Gln Thr Lys Gln Ser
65 70 75 80

Gly Glu Asn Phe Pro Tyr Leu Val Ala Tyr Gln Ala Thr Val Cys Ala
85 90 95

Arg Ala Gln Ala Pro Pro Pro Ser Trp Asp Gln Met Trp Lys Cys Leu
100 105 110

Ile Arg Leu Lys Pro Thr Leu His Gly Pro Thr Pro Leu Leu Tyr Arg
115 120 125



Leu 



<210> 91 
<211> 51 
<212> PRT 

<213> Hepatitis C virus 
<400> 91 

Thr Ser Thr Trp Val Leu Val Gly Gly Val Leu Ala Ala Leu Ala Ala 
1 5 10 15

Tyr Cys Leu Ser Thr Gly Cys Val Val Ile Val Gly Arg Ile Val Leu
20 25 30

Ser Gly Lys Pro Ala Ile Ile Pro Asp Arg Glu Val Leu Tyr Gln Glu
35 40 45 

Phe Asp Glu 
50 



<210> 92 
<211> 18 
<212> PRT 

<213> Hepatitis C virus 
<400> 92 

Ala Ala Leu Ala Ala Tyr Cys Leu Ser Thr Gly Cys Val Val Ile Val
1 5 10 15

Gly Arg 



<210> 93
<211> 208
<212> PRT 

<213> Hepatitis C virus 
<400> 93 

Phe Thr Ala Ala Val Thr Ser Pro Leu Thr Thr Gly Gln Thr Leu Leu
1 5 10 15

Phe Asn Ile Leu Gly Gly Trp Val Ala Ala Gln Leu Ala Ala Pro Gly
20 25 30

Ala Ala Thr Ala Phe Val Gly Ala Gly Leu Ala Gly Ala Ala Ile Gly
35 40 45

Ser Val Gly Leu Gly Lys Val Leu Val Asp Ile Leu Ala Gly Tyr Gly
50 55 60

Ala Gly Val Ala Gly Ala Leu Val Ala Phe Lys Ile Met Ser Gly Glu
65 70 75 80

Val Pro Ser Thr Glu Asp Leu Val Asn Leu Leu Pro Ala Ile Leu Ser
85 90 95

Pro Gly Ala Leu Val Val Gly Val Val Cys Ala Ala Ile Leu Arg Arg
100 105 110

His Val Gly Pro Gly Glu Gly Ala Val Gln Trp Met Asn Arg Leu Ile
115 120 125

Ala Phe Ala Ser Arg Gly Asn His Val Ser Pro Thr His Tyr Val Pro
130 135 140

Glu Ser Asp Ala Ala Ala Arg Val Thr Ala Ile Leu Ser Ser Leu Thr
145 150 155 160

Val Thr Gln Leu Leu Arg Arg Leu His Gln Trp Ile Ser Ser Glu Cys
165 170 175

Thr Thr Pro Cys Ser Gly Ser Trp Leu Arg Asp Ile Trp Asp Trp Ile
180 185 190 

Cys Glu Val Leu Ser Asp Phe Lys Thr Trp Leu Lys Ala Lys Leu Met 
195 200 205 



<210> 94 
<211> 207 
<212> PRT 

<213> Hepatitis C virus 



<400> 94

Thr Ala Phe Val Gly Ala Gly Leu Ala Gly Ala Ala Ile Gly Ser Val
1 5 10 15

Gly Leu Gly Lys Val Leu Val Asp Ile Leu Ala Gly Tyr Gly Ala Gly
20 25 30

Val Ala Gly Ala Leu Val Ala Phe Lys Ile Met Ser Gly Glu Val Pro
35 40 45

Ser Thr Glu Asp Leu Val Asn Leu Leu Pro Ala Ile Leu Ser Pro Gly
50 55 60

Ala Leu Val Val Gly Val Val Cys Ala Ala Ile Leu Arg Arg His Val
65 70 75 80

Gly Pro Gly Glu Gly Ala Val Gln Trp Met Asn Arg Leu Ile Ala Phe
85 90 95

Ala Ser Arg Gly Asn His Val Ser Pro Thr His Tyr Val Pro Glu Ser
100 105 110

Asp Ala Ala Ala Arg Val Thr Ala Ile Leu Ser Ser Leu Thr Val Thr
115 120 125

Gln Leu Leu Arg Arg Leu His Gln Trp Ile Ser Ser Glu Cys Thr Thr
130 135 140

Pro Cys Ser Gly Ser Trp Leu Arg Asp Ile Trp Asp Trp Ile Cys Glu
145 150 155 160

Val Leu Ser Asp Phe Lys Thr Trp Leu Lys Ala Lys Leu Met Pro Gln
165 170 175

Leu Pro Gly Ile Pro Phe Val Ser Cys Gln Arg Gly Tyr Arg Gly Val
180 185 190

Trp Arg Gly Asp Gly Ile Met His Thr Arg Cys His Cys Gly Ala
195 200 205 



<210> 95 
<211> 225 
<212> PRT 

<213> Hepatitis C virus 
<400> 95 

Leu Val Asp Ile Leu Ala Gly Tyr Gly Ala Gly Val Ala Gly Ala Leu
1 5 10 15

Val Ala Phe Lys Ile Met Ser Gly Glu Val Pro Ser Thr Glu Asp Leu
20 25 30

Val Asn Leu Leu Pro Ala Ile Leu Ser Pro Gly Ala Leu Val Val Gly
35 40 45

Val Val Cys Ala Ala Ile Leu Arg Arg His Val Gly Pro Gly Glu Gly
50 55 60

Ala Val Gln Trp Met Asn Arg Leu Ile Ala Phe Ala Ser Arg Gly Asn
65 70 75 80

His Val Ser Pro Thr His Tyr Val Pro Glu Ser Asp Ala Ala Ala Arg
85 90 95

Val Thr Ala Ile Leu Ser Ser Leu Thr Val Thr Gln Leu Leu Arg Arg
100 105 110

Leu His Gln Trp Ile Ser Ser Glu Cys Thr Thr Pro Cys Ser Gly Ser
115 120 125

Trp Leu Arg Asp Ile Trp Asp Trp Ile Cys Glu Val Leu Ser Asp Phe
130 135 140

Lys Thr Trp Leu Lys Ala Lys Leu Met Pro Gln Leu Pro Gly Ile Pro
145 150 155 160

Phe Val Ser Cys Gln Arg Gly Tyr Arg Gly Val Trp Arg Gly Asp Gly
165 170 175

Ile Met His Thr Arg Cys His Cys Gly Ala Glu Ile Thr Gly His Val
180 185 190

Lys Asn Gly Thr Met Arg Ile Val Gly Pro Arg Thr Cys Arg Asn Met
195 200 205

Trp Ser Gly Thr Phe Pro Ile Asn Ala Tyr Thr Thr Gly Pro Cys Thr
210 215 220 

Pro 
225 



<210> 96 
<211> 145 
<212> PRT 

<213> Hepatitis C virus 
<400> 96 

Ala Gly Tyr Gly Ala Gly Val Ala Gly Ala Leu Val Ala Phe Lys Ile
1 5 10 15

Met Ser Gly Glu Val Pro Ser Thr Glu Asp Leu Val Asn Leu Leu Pro
20 25 30

Ala Ile Leu Ser Pro Gly Ala Leu Val Val Gly Val Val Cys Ala Ala
35 40 45

Ile Leu Arg Arg His Val Gly Pro Gly Glu Gly Ala Val Gln Trp Met
50 55 60

Asn Arg Leu Ile Ala Phe Ala Ser Arg Gly Asn His Val Ser Pro Thr
65 70 75 80

His Tyr Val Pro Glu Ser Asp Ala Ala Ala Arg Val Thr Ala Ile Leu
85 90 95

Ser Ser Leu Thr Val Thr Gln Leu Leu Arg Arg Leu His Gln Trp Ile
100 105 110

Ser Ser Glu Cys Thr Thr Pro Cys Ser Gly Ser Trp Leu Arg Asp Ile
115 120 125

Trp Asp Trp Ile Cys Glu Val Leu Ser Asp Phe Lys Thr Trp Leu Lys
130 135 140 

Ala 
145 



<210> 97 
<211> 54 
<212> PRT 

<213> Hepatitis C virus 
<400> 97 

Ala Leu Val Val Gly Val Val Cys Ala Ala Ile Leu Arg Arg His Val
1 5 10 15

Gly Pro Gly Glu Gly Ala Val Gln Trp Met Asn Arg Leu Ile Ala Phe
20 25 30 

Ala Ser Arg Gly Asn His Val Ser Pro Thr His Tyr Val Pro Glu Ser 
35 40 45 

Asp Ala Ala Ala Arg Val 
50 



<210> 98 
<211> 165 
<212> PRT 

<213> Hepatitis C virus 
<400> 98 

Ala Ser Arg Gly Asn His Val Ser Pro Thr His Tyr Val Pro Glu Ser
1 5 10 15

Asp Ala Ala Ala Arg Val Thr Ala Ile Leu Ser Ser Leu Thr Val Thr
20 25 30

Gln Leu Leu Arg Arg Leu His Gln Trp Ile Ser Ser Glu Cys Thr Thr
35 40 45

Pro Cys Ser Gly Ser Trp Leu Arg Asp Ile Trp Asp Trp Ile Cys Glu
50 55 60

Val Leu Ser Asp Phe Lys Thr Trp Leu Lys Ala Lys Leu Met Pro Gln
65 70 75 80

Leu Pro Gly Ile Pro Phe Val Ser Cys Gln Arg Gly Tyr Arg Gly Val
85 90 95

Trp Arg Gly Asp Gly Ile Met His Thr Arg Cys His Cys Gly Ala Glu
100 105 110

Ile Thr Gly His Val Lys Asn Gly Thr Met Arg Ile Val Gly Pro Arg
115 120 125

Thr Cys Arg Asn Met Trp Ser Gly Thr Phe Pro Ile Asn Ala Tyr Thr
130 135 140 

Thr Gly Pro Cys Thr Pro Leu Pro Ala Pro Asn Tyr Lys Phe Ala Leu 
145 150 155 160 

Trp Arg Val Ser Ala 
165 



<210> 99 
<211> 308 
<212> PRT 

<213> Hepatitis C virus 
<400> 99 

Tyr Val Pro Glu Ser Asp Ala Ala Ala Arg Val Thr Ala Ile Leu Ser
1 5 10 15

Ser Leu Thr Val Thr Gln Leu Leu Arg Arg Leu His Gln Trp Ile Ser
20 25 30

Ser Glu Cys Thr Thr Pro Cys Ser Gly Ser Trp Leu Arg Asp Ile Trp
35 40 45

Asp Trp Ile Cys Glu Val Leu Ser Asp Phe Lys Thr Trp Leu Lys Ala
50 55 60

Lys Leu Met Pro Gln Leu Pro Gly Ile Pro Phe Val Ser Cys Gln Arg
65 70 75 80

Gly Tyr Arg Gly Val Trp Arg Gly Asp Gly Ile Met His Thr Arg Cys
85 90 95

His Cys Gly Ala Glu Ile Thr Gly His Val Lys Asn Gly Thr Met Arg
100 105 110

Ile Val Gly Pro Arg Thr Cys Arg Asn Met Trp Ser Gly Thr Phe Pro
115 120 125

Ile Asn Ala Tyr Thr Thr Gly Pro Cys Thr Pro Leu Pro Ala Pro Asn
130 135 140

Tyr Lys Phe Ala Leu Trp Arg Val Ser Ala Glu Glu Tyr Val Glu Ile
145 150 155 160

Arg Arg Val Gly Asp Phe His Tyr Val Ser Gly Met Thr Thr Asp Asn
165 170 175

Leu Lys Cys Pro Cys Gln Ile Pro Ser Pro Glu Phe Phe Thr Glu Leu
180 185 190

Asp Gly Val Arg Leu His Arg Phe Ala Pro Pro Cys Lys Pro Leu Leu
195 200 205

Arg Glu Glu Val Ser Phe Arg Val Gly Leu His Glu Tyr Pro Val Gly
210 215 220

Ser Gln Leu Pro Cys Glu Pro Glu Pro Asp Val Ala Val Leu Thr Ser
225 230 235 240

Met Leu Thr Asp Pro Ser His Ile Thr Ala Glu Ala Ala Gly Arg Arg
245 250 255

Leu Ala Arg Gly Ser Pro Pro Ser Met Ala Ser Ser Ser Ala Ser Gln
260 265 270

Leu Ser Ala Pro Ser Leu Lys Ala Thr Cys Thr Ala Asn His Asp Ser
275 280 285

Pro Asp Ala Glu Leu Ile Glu Ala Asn Leu Leu Trp Arg Gln Glu Met
290 295 300

Gly Gly Asn Ile
305 



<210> 100 
<211> 283 
<212> PRT 

<213> Hepatitis C virus 



<400> 100 

Leu Ser Ser Leu Thr Val Thr Gln Leu Leu Arg Arg Leu His Gln Trp
1 5 10 15

Ile Ser Ser Glu Cys Thr Thr Pro Cys Ser Gly Ser Trp Leu Arg Asp
20 25 30

Ile Trp Asp Trp Ile Cys Glu Val Leu Ser Asp Phe Lys Thr Trp Leu
35 40 45

Lys Ala Lys Leu Met Pro Gln Leu Pro Gly Ile Pro Phe Val Ser Cys
50 55 60

Gln Arg Gly Tyr Arg Gly Val Trp Arg Gly Asp Gly Ile Met His Thr
65 70 75 80

Arg Cys His Cys Gly Ala Glu Ile Thr Gly His Val Lys Asn Gly Thr
85 90 95

Met Arg Ile Val Gly Pro Arg Thr Cys Arg Asn Met Trp Ser Gly Thr
100 105 110

Phe Pro Ile Asn Ala Tyr Thr Thr Gly Pro Cys Thr Pro Leu Pro Ala
115 120 125

Pro Asn Tyr Lys Phe Ala Leu Trp Arg Val Ser Ala Glu Glu Tyr Val
130 135 140

Glu Ile Arg Arg Val Gly Asp Phe His Tyr Val Ser Gly Met Thr Thr
145 150 155 160

Asp Asn Leu Lys Cys Pro Cys Gln Ile Pro Ser Pro Glu Phe Phe Thr
165 170 175

Glu Leu Asp Gly Val Arg Leu His Arg Phe Ala Pro Pro Cys Lys Pro
180 185 190

Leu Leu Arg Glu Glu Val Ser Phe Arg Val Gly Leu His Glu Tyr Pro
195 200 205

Val Gly Ser Gln Leu Pro Cys Glu Pro Glu Pro Asp Val Ala Val Leu
210 215 220

Thr Ser Met Leu Thr Asp Pro Ser His Ile Thr Ala Glu Ala Ala Gly
225 230 235 240

Arg Arg Leu Ala Arg Gly Ser Pro Pro Ser Met Ala Ser Ser Ser Ala
245 250 255

Ser Gln Leu Ser Ala Pro Ser Leu Lys Ala Thr Cys Thr Ala Asn His
260 265 270

Asp Ser Pro Asp Ala Glu Leu Ile Glu Ala Asn
275 280 
<210> 101
<211> 249 
<212> PRT 

<213> Hepatitis C virus 
<400> 101 

Ser Leu Thr Val Thr Gln Leu Leu Arg Arg Leu His Gln Trp Ile Ser
1 5 10 15

Ser Glu Cys Thr Thr Pro Cys Ser Gly Ser Trp Leu Arg Asp Ile Trp
20 25 30

Asp Trp Ile Cys Glu Val Leu Ser Asp Phe Lys Thr Trp Leu Lys Ala
35 40 45

Lys Leu Met Pro Gln Leu Pro Gly Ile Pro Phe Val Ser Cys Gln Arg
50 55 60

Gly Tyr Arg Gly Val Trp Arg Gly Asp Gly Ile Met His Thr Arg Cys
65 70 75 80

His Cys Gly Ala Glu Ile Thr Gly His Val Lys Asn Gly Thr Met Arg
85 90 95

Ile Val Gly Pro Arg Thr Cys Arg Asn Met Trp Ser Gly Thr Phe Pro
100 105 110

Ile Asn Ala Tyr Thr Thr Gly Pro Cys Thr Pro Leu Pro Ala Pro Asn
115 120 125

Tyr Lys Phe Ala Leu Trp Arg Val Ser Ala Glu Glu Tyr Val Glu Ile
130 135 140

Arg Arg Val Gly Asp Phe His Tyr Val Ser Gly Met Thr Thr Asp Asn
145 150 155 160

Leu Lys Cys Pro Cys Gln Ile Pro Ser Pro Glu Phe Phe Thr Glu Leu
165 170 175

Asp Gly Val Arg Leu His Arg Phe Ala Pro Pro Cys Lys Pro Leu Leu
180 185 190

Arg Glu Glu Val Ser Phe Arg Val Gly Leu His Glu Tyr Pro Val Gly
195 200 205

Ser Gln Leu Pro Cys Glu Pro Glu Pro Asp Val Ala Val Leu Thr Ser
210 215 220

Met Leu Thr Asp Pro Ser His Ile Thr Ala Glu Ala Ala Gly Arg Arg
225 230 235 240 



Leu Ala Arg Gly Ser Pro Pro Ser Met 
245 
<210> 102
<211> 85 
<212> PRT 

<213> Hepatitis C virus 
<400> 102 

Thr Trp Leu Lys Ala Lys Leu Met Pro Gln Leu Pro Gly Ile Pro Phe
1 5 10 15

Val Ser Cys Gln Arg Gly Tyr Arg Gly Val Trp Arg Gly Asp Gly Ile
20 25 30

Met His Thr Arg Cys His Cys Gly Ala Glu Ile Thr Gly His Val Lys
35 40 45

Asn Gly Thr Met Arg Ile Val Gly Pro Arg Thr Cys Arg Asn Met Trp
50 55 60

Ser Gly Thr Phe Pro Ile Asn Ala Tyr Thr Thr Gly Pro Cys Thr Pro
65 70 75 80 

Leu Pro Ala Pro Asn 

85 



<210> 103 
<211> 94 
<212> PRT 

<213> Hepatitis C virus 
<400> 103 

Glu Ile Thr Gly His Val Lys Asn Gly Thr Met Arg Ile Val Gly Pro
1 5 10 15

Arg Thr Cys Arg Asn Met Trp Ser Gly Thr Phe Pro Ile Asn Ala Tyr
20 25 30

Thr Thr Gly Pro Cys Thr Pro Leu Pro Ala Pro Asn Tyr Lys Phe Ala
35 40 45

Leu Trp Arg Val Ser Ala Glu Glu Tyr Val Glu Ile Arg Arg Val Gly
50 55 60

Asp Phe His Tyr Val Ser Gly Met Thr Thr Asp Asn Leu Lys Cys Pro
65 70 75 80

Cys Gln Ile Pro Ser Pro Glu Phe Phe Thr Glu Leu Asp Gly
85 90 






<210> 104 
<211> 75 
<212> PRT 

<213> Hepatitis C virus 
<400> 104 

Ile Glu Ala Asn Leu Leu Trp Arg Gln Glu Met Gly Gly Asn Ile Thr
1 5 10 15

Arg Val Glu Ser Glu Asn Lys Val Val Ile Leu Asp Ser Phe Asp Pro
20 25 30

Leu Val Ala Glu Glu Asp Glu Arg Glu Val Ser Val Pro Ala Glu Ile
35 40 45 

Leu Arg Lys Ser Arg Arg Phe Ala Arg Ala Leu Pro Val Trp Ala Arg 
50 55 60 

Pro Asp Tyr Asn Pro Pro Leu Val Glu Thr Trp 
65 70 75 



<210> 105 
<211> 90 
<212> PRT 

<213> Hepatitis C virus 
<400> 105 

His Gly Cys Pro Leu Pro Pro Pro Arg Ser Pro Pro Val Pro Pro Pro
1 5 10 15

Arg Lys Lys Arg Thr Val Val Leu Thr Glu Ser Thr Leu Ser Thr Ala
20 25 30

Leu Ala Glu Leu Ala Thr Lys Ser Phe Gly Ser Ser Ser Thr Ser Gly
35 40 45

Ile Thr Gly Asp Asn Thr Thr Thr Ser Ser Glu Pro Ala Pro Ser Gly
50 55 60 

Cys Pro Pro Asp Ser Asp Val Glu Ser Tyr Ser Ser Met Pro Pro Leu 
65 70 75 80 

Glu Gly Glu Pro Gly Asp Pro Asp Leu Ser 

85 90 



<210> 106 
<211> 137 
<212> PRT 

<213> Hepatitis C virus 
<400> 106

Ser Trp Thr Gly Ala Leu Val Thr Pro Cys Ala Ala Glu Glu Gln Lys
1 5 10 15

Leu Pro Ile Asn Ala Leu Ser Asn Ser Leu Leu Arg His His Asn Leu
20 25 30

Val Tyr Ser Thr Thr Ser Arg Ser Ala Cys Gln Arg Gln Lys Lys Val
35 40 45

Thr Phe Asp Arg Leu Gln Val Leu Asp Ser His Tyr Gln Asp Val Leu
50 55 60

Lys Glu Val Lys Ala Ala Ala Ser Lys Val Lys Ala Asn Leu Leu Ser
65 70 75 80

Val Glu Glu Ala Cys Ser Leu Thr Pro Pro His Ser Ala Lys Ser Lys
85 90 95

Phe Gly Tyr Gly Ala Lys Asp Val Arg Cys His Ala Arg Lys Ala Val
100 105 110

Ala His Ile Asn Ser Val Trp Lys Asp Leu Leu Glu Asp Ser Val Thr
115 120 125

Pro Ile Asp Thr Thr Ile Met Ala Lys
130 135



<210> 107 
<211> 300 
<212> PRT 

<213> Hepatitis C virus 
<400> 107 

Ala Asn Leu Leu Ser Val Glu Glu Ala Cys Ser Leu Thr Pro Pro His
1 5 10 15

Ser Ala Lys Ser Lys Phe Gly Tyr Gly Ala Lys Asp Val Arg Cys His
20 25 30

Ala Arg Lys Ala Val Ala His Ile Asn Ser Val Trp Lys Asp Leu Leu
35 40 45

Glu Asp Ser Val Thr Pro Ile Asp Thr Thr Ile Met Ala Lys Asn Glu
50 55 60

Val Phe Cys Val Gln Pro Glu Lys Gly Gly Arg Lys Pro Ala Arg Leu
65 70 75 80

Ile Val Phe Pro Asp Leu Gly Val Arg Val Cys Glu Lys Met Ala Leu
85 90 95

Tyr Asp Val Val Ser Lys Leu Pro Leu Ala Val Met Gly Ser Ser Tyr
100 105 110

Gly Phe Gln Tyr Ser Pro Gly Gln Arg Val Glu Phe Leu Val Gln Ala
115 120 125

Trp Lys Ser Lys Lys Thr Pro Met Gly Phe Ser Tyr Asp Thr Arg Cys
130 135 140

Phe Asp Ser Thr Val Thr Glu Ser Asp Ile Arg Thr Glu Glu Ala Ile
145 150 155 160

Tyr Gln Cys Cys Asp Leu Asp Pro Gln Ala Arg Val Ala Ile Lys Ser
165 170 175

Leu Thr Glu Arg Leu Tyr Val Gly Gly Pro Leu Thr Asn Ser Arg Gly
180 185 190

Glu Asn Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Thr Thr
195 200 205

Ser Cys Gly Asn Thr Leu Thr Cys Tyr Ile Lys Ala Arg Ala Ala Cys
210 215 220

Arg Ala Ala Gly Leu Gln Asp Cys Thr Met Leu Val Cys Gly Asp Asp
225 230 235 240

Leu Val Val Ile Cys Glu Ser Ala Gly Val Gln Glu Asp Ala Ala Ser
245 250 255

Leu Arg Ala Phe Thr Glu Ala Met Thr Arg Tyr Ser Ala Pro Pro Gly
260 265 270

Asp Pro Pro Gln Pro Glu Tyr Asp Leu Glu Leu Ile Thr Ser Cys Ser
275 280 285 

Ser Asn Val Ser Val Ala His Asp Gly Ala Gly Lys 
290 295 300 



<210> 108 
<211> 199 
<212> PRT 

<213> Hepatitis C virus 
<400> 108 

Glu Glu Ala Cys Ser Leu Thr Pro Pro His Ser Ala Lys Ser Lys Phe
1 5 10 15

Gly Tyr Gly Ala Lys Asp Val Arg Cys His Ala Arg Lys Ala Val Ala
20 25 30

His Ile Asn Ser Val Trp Lys Asp Leu Leu Glu Asp Ser Val Thr Pro
35 40 45

Ile Asp Thr Thr Ile Met Ala Lys Asn Glu Val Phe Cys Val Gln Pro
50 55 60

Glu Lys Gly Gly Arg Lys Pro Ala Arg Leu Ile Val Phe Pro Asp Leu
65 70 75 80

Gly Val Arg Val Cys Glu Lys Met Ala Leu Tyr Asp Val Val Ser Lys
85 90 95

Leu Pro Leu Ala Val Met Gly Ser Ser Tyr Gly Phe Gln Tyr Ser Pro
100 105 110

Gly Gln Arg Val Glu Phe Leu Val Gln Ala Trp Lys Ser Lys Lys Thr
115 120 125

Pro Met Gly Phe Ser Tyr Asp Thr Arg Cys Phe Asp Ser Thr Val Thr
130 135 140

Glu Ser Asp Ile Arg Thr Glu Glu Ala Ile Tyr Gln Cys Cys Asp Leu
145 150 155 160

Asp Pro Gln Ala Arg Val Ala Ile Lys Ser Leu Thr Glu Arg Leu Tyr
165 170 175

Val Gly Gly Pro Leu Thr Asn Ser Arg Gly Glu Asn Cys Gly Tyr Arg
180 185 190

Arg Cys Arg Ala Ser Gly Val
195 



<210> 109 
<211> 260 
<212> PRT 

<213> Hepatitis C virus 
<400> 109 

Leu Leu Glu Asp Ser Val Thr Pro Ile Asp Thr Thr Ile Met Ala Lys
1 5 10 15

Asn Glu Val Phe Cys Val Gln Pro Glu Lys Gly Gly Arg Lys Pro Ala
20 25 30

Arg Leu Ile Val Phe Pro Asp Leu Gly Val Arg Val Cys Glu Lys Met
35 40 45

Ala Leu Tyr Asp Val Val Ser Lys Leu Pro Leu Ala Val Met Gly Ser
50 55 60

Ser Tyr Gly Phe Gln Tyr Ser Pro Gly Gln Arg Val Glu Phe Leu Val
65 70 75 80

Gln Ala Trp Lys Ser Lys Lys Thr Pro Met Gly Phe Ser Tyr Asp Thr
85 90 95

Arg Cys Phe Asp Ser Thr Val Thr Glu Ser Asp Ile Arg Thr Glu Glu
100 105 110

Ala Ile Tyr Gln Cys Cys Asp Leu Asp Pro Gln Ala Arg Val Ala Ile
115 120 125

Lys Ser Leu Thr Glu Arg Leu Tyr Val Gly Gly Pro Leu Thr Asn Ser
130 135 140

Arg Gly Glu Asn Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu
145 150 155 160

Thr Thr Ser Cys Gly Asn Thr Leu Thr Cys Tyr Ile Lys Ala Arg Ala
165 170 175

Ala Cys Arg Ala Ala Gly Leu Gln Asp Cys Thr Met Leu Val Cys Gly
180 185 190

Asp Asp Leu Val Val Ile Cys Glu Ser Ala Gly Val Gln Glu Asp Ala
195 200 205

Ala Ser Leu Arg Ala Phe Thr Glu Ala Met Thr Arg Tyr Ser Ala Pro
210 215 220

Pro Gly Asp Pro Pro Gln Pro Glu Tyr Asp Leu Glu Leu Ile Thr Ser
225 230 235 240 

Cys Ser Ser Asn Val Ser Val Ala His Asp Gly Ala Gly Lys Arg Val 
245 250 255 



Tyr Tyr Leu Thr 
260 



<210> 110 
<211> 127 
<212> PRT 

<213> Hepatitis C virus 
<400> 110 

Val Ile Cys Glu Ser Ala Gly Val Gln Glu Asp Ala Ala Ser Leu Arg
1 5 10 15

Ala Phe Thr Glu Ala Met Thr Arg Tyr Ser Ala Pro Pro Gly Asp Pro
20 25 30

Pro Gln Pro Glu Tyr Asp Leu Glu Leu Ile Thr Ser Cys Ser Ser Asn
35 40 45

Val Ser Val Ala His Asp Gly Ala Gly Lys Arg Val Tyr Tyr Leu Thr
50 55 60

Arg Asp Pro Thr Thr Pro Leu Ala Arg Ala Ala Trp Glu Thr Ala Arg
65 70 75 80

His Thr Pro Val Asn Ser Trp Leu Gly Asn Ile Ile Met Phe Ala Pro
85 90 95

Thr Leu Trp Ala Arg Met Ile Leu Met Thr His Phe Phe Ser Val Leu
100 105 110

Ile Ala Arg Asp Gln Leu Glu Gln Ala Leu Asn Cys Glu Ile Tyr
115 120 125



<210> 111 
<211> 89 
<212> PRT 

<213> Hepatitis C virus 
<400> 111 

Val Ser Val Ala His Asp Gly Ala Gly Lys Arg Val Tyr Tyr Leu Thr
1 5 10 15

Arg Asp Pro Thr Thr Pro Leu Ala Arg Ala Ala Trp Glu Thr Ala Arg
20 25 30

His Thr Pro Val Asn Ser Trp Leu Gly Asn Ile Ile Met Phe Ala Pro
35 40 45

Thr Leu Trp Ala Arg Met Ile Leu Met Thr His Phe Phe Ser Val Leu
50 55 60

Ile Ala Arg Asp Gln Leu Glu Gln Ala Leu Asn Cys Glu Ile Tyr Gly
65 70 75 80

Ala Cys Tyr Ser Ile Glu Pro Leu Asp
85



<210> 112 
<211> 73 
<212> PRT 

<213> Hepatitis C virus 
<400> 112 

Leu His Gly Leu Ser Ala Phe Ser Leu His Ser Tyr Ser Pro Gly Glu
1 5 10 15

Ile Asn Arg Val Ala Ala Cys Leu Arg Lys Leu Gly Val Pro Pro Leu
20 25 30

Arg Ala Trp Arg His Arg Ala Arg Ser Val Arg Ala Arg Leu Leu Ser
35 40 45

Arg Gly Gly Arg Ala Ala Ile Cys Gly Lys Tyr Leu Phe Asn Trp Ala
50 55 60 

Val Arg Thr Lys Leu Lys Leu Thr Pro 
65 70 



<210> 113 
<211> 63 
<212> PRT 

<213> Hepatitis C virus 
<400> 113 

Ser Pro Gly Glu Ile Asn Arg Val Ala Ala Cys Leu Arg Lys Leu Gly
1 5 10 15

Val Pro Pro Leu Arg Ala Trp Arg His Arg Ala Arg Ser Val Arg Ala
20 25 30

Arg Leu Leu Ser Arg Gly Gly Arg Ala Ala Ile Cys Gly Lys Tyr Leu
35 40 45

Phe Asn Trp Ala Val Arg Thr Lys Leu Lys Leu Thr Pro Ile Ala
50 55 60 



<210> 114 
<211> 310 
<212> DNA 

<213> Hepatitis C virus 
<400> 114 

tgcttgcgag tgccccggga ggtctcgtag accgtgcacc atgagcacga atcctaaacc 60
tcaaagaaaa accaaacgta acaccaaccg tcgcccacag gacgtcaagt tcccgggtgg 120
cggtcagatc gttggtggag tttacttgtt gccgcgcagg ggccctagat tgggtgtgcg 180
cgcgacgagg aagacttccg agcggtcgca acctcgaggt agacgtcagc ctatccccaa 240
ggcacgtcgg cccgagggca ggacctgggc tcagcccggg tacccttggc ccctctatgg 300
caatgagggt 310



<210> 115 
<211> 339 
<212> DNA 

<213> Hepatitis C virus 
<400> 115 

atgagcacga atcctaaacc tcaaagaaaa accaaacgta acaccaaccg tcgcccacag 60 
gacgtcaagt tcccgggtgg cggtcagatc gttggtggag tttacttgtt gccgcgcagg 120
ggccctagat tgggtgtgcg cgcgacgagg aagacttccg agcggtcgca acctcgaggt 180
agacgtcagc ctatccccaa ggcacgtcgg cccgagggca ggacctgggc tcagcccggg 240
tacccttggc ccctctatgg caatgagggt tgcgggtggg cgggatggct cctgtctccc 300
cgtggctctc ggcctagctg gggccccaca gacccccgg 339



<210> 116 
<211> 345 
<212> DNA 

<213> Hepatitis C virus 
<400> 116 

tgccatcctg cacactccgg ggtgtgtccc ttgcgttcgc gagggtaacg cctcgaggtg 60 
ttgggtggcg gtgaccccca cggtggccac cagggacggc aaactcccca caacgcagct 120
tcgacgtcat atcgatctgc ttgtcgggag cgccaccctc tgctcggccc tctacgtggg 180
ggacctgtgc gggtctgtct ttcttgttgg tcaactgttt accttctctc ccaggcgcca 240
ctggacgacg caagactgca attgttctat ctatcccggc catataacgg gtcatcgcat 300 
ggcatgggat atgatgatga actggtcccc tacggcagcg ttggt 345 



<210> 117 
<211> 276 
<212> DNA 

<213> Hepatitis C virus 
<400> 117 

cggcgtcgac gcggaaaccc acgtcaccgg gggaaatgcc ggccgcacca cggctgggct 60
tgttggtctc cttacaccag gcgccaagca gaacatccaa ctgatcaaca ccaacggcag 120
ttggcacatc aatagcacgg ccttgaattg caatgaaagc cttaacaccg gctggttagc 180
agggctcttc tatcaacaca aattcaactc ttcaggctgt cctgagaggt tggccagctg 240
ccgacgcctt accgattttg cccagggctg gggtcc 276



<210> 118 
<211> 531 
<212> DNA 

<213> Hepatitis C virus 
<400> 118 

ctggggtcct atcagttatg ccaacggaag cggcctcgac gaacgcccct actgctggca 60
ctaccctcca agaccttgtg gcattgtgcc cgcaaagagc gtgtgtggcc cggtatattg 120
cttcactccc agccccgtgg tggtgggaac gaccgacagg tcgggcgcgc ctacctacag 180
ctggggtgca aatgatacgg atgtcttcgt ccttaacaac accaggccac cgctgggcaa 240
ttggttcggt tgtacctgga tgaactcaac tggattcacc aaagtgtgcg gagcgccccc 300
ttgtgtcatc ggaggggtgg gcaacaacac cttgctctgc cccactgatt gcttccgcaa 360
acatccggaa gccacatact ctcggtgcgg ctccggtccc tggattacac ccaggtgcat 420
ggtcgactac ccgtataggc tttggcacta tccttgtacc atcaattaca ccatattcaa 480
agtcaggatg tacgtgggag gggtcgagca caggctggaa gcggcctgca a 531



<210> 119 
<211> 289 
<212> DNA 

<213> Hepatitis C virus 






<400> 119 

ctggcactac cctccaagac cttgtggcat tgtgcccgca aagagcgtgt gtggcccggt 60
atattgcttc actcccagcc ccgtggtggt gggaacgacc gacaggtcgg gcgcgcctac 120
ctacagctgg ggtgcaaatg atacggatgt cttcgtcctt aacaacacca ggccaccgct 180
gggcaattgg ttcggttgta cctggatgaa ctcaactgga ttcaccaaag tgtgcggagc 240
gcccccttgt gtcatcggag gggtgggcaa caacaccttg ctctgcccc 289



<210> 120 
<211> 836 
<212> DNA 

<213> Hepatitis C virus 
<400> 120 

gccgcgtgcg gtgacatcat caacggcttg cccgtctctg cccgtagggg ccaggagata 60 
ctgcttgggc cagccgacgg aatggtctcc aaggggtgga ggttgctggc gcccatcacg 120 
gcgtacgccc agcagacgag aggcctccta gggtgtataa tcaccagcct gactggccgg 180 
gacaaaaacc aagtggaggg tgaggtccag atcgtgtcaa ctgctaccca aaccttcctg 240 
gcaacgtgca tcaatggggt atgctggact gtctaccacg gggccggaac gaggaccatc 300
gcatcaccca agggtcctgt catccagatg tataccaatg tggaccaaga ccttgtgggc 360 
tggcccgctc ctcaaggttc ccgctcattg acaccctgta cctgcggctc ctcggacctt 420 
tacctggtca cgaggcacgc cgatgtcatt cccgtgcgcc ggcgaggtga tagcaggggt 480 
agcctgcttt cgccccggcc catttcctac ttgaaaggct cctcgggggg tccgctgttg 540 
tgccccgcgg gacacgccgt gggcctattc agggccgcgg tgtgcacccg tggagtggct 600 
aaagcggtgg actttatccc tgtggagaac ctagggacaa ccatgagatc cccggtgttc 660 
acggacaact cctctccacc agcagtgccc cagagcttcc aggtggccca cctgcatgct 720 
cccaccggca gcggtaagag caccaaggtc ccggctgcgt acgcagccca gggctacaag 780 
gtgttggtgc tcaacccctc tgttgctgca acgctgggct ttggtgctta catgtc 836 



<210> 121 
<211> 475 
<212> DNA

<213> Hepatitis C virus 
<400> 121 

gcgccggcga ggtgatagca ggggtagcct gctttcgccc cggcccattt cctacttgaa 60
aggctcctcg gggggtccgc tgttgtgccc cgcgggacac gccgtgggcc tattcagggc 120
cgcggtgtgc acccgtggag tggctaaagc ggtggacttt atccctgtgg agaacctagg 180
gacaaccatg agatccccgg tgttcacgga caactcctct ccaccagcag tgccccagag 240
cttccaggtg gcccacctgc atgctcccac cggcagcggt aagagcacca aggtcccggc 300
tgcgtacgca gcccagggct acaaggtgtt ggtgctcaac ccctctgttg ctgcaacgct 360
gggctttggt gcttacatgt ccaaggccca tggggttgat cctaatatca ggaccggggt 420
gagaacaatt accactggca gccccatcac gtactccacc tacggcaagt tcctt 475



<210> 122 
<211> 790 
<212> DNA 

<213> Hepatitis C virus 



<400> 122 

tgatagcagg ggtagcctgc tttcgccccg gcccatttcc tacttgaaag gctcctcggg 60 




gggtccgctg ttgtgccccg cgggacacgc cgtgggccta ttcagggccg cggtgtgcac 120 
ccgtggagtg gctaaagcgg tggactttat ccctgtggag aacctaggga caaccatgag 180 
atccccggtg ttcacggaca actcctctcc accagcagtg ccccagagct tccaggtggc 240 
ccacctgcat gctcccaccg gcagcggtaa gagcaccaag gtcccggctg cgtacgcagc 300
ccagggctac aaggtgttgg tgctcaaccc ctctgttgct gcaacgctgg gctttggtgc 360 
ttacatgtcc aaggcccatg gggttgatcc taatatcagg accggggtga gaacaattac 420 
cactggcagc cccatcacgt actccaccta cggcaagttc cttgccgacg gcgggtgctc 480 
aggaggtgct tatgacataa taatttgtga cgagtgccac tccacggatg ccacatccat 540 
cttgggcatc ggcactgtcc ttgaccaagc agagactgcg ggggcgagac tggttgtgct 600 
cgccactgct acccctccgg gctccgtcac tgtgtcccat cctaacatcg aggaggttgc 660 
tctgtccacc accggagaga tcccctttta cggcaaggct atccccctcg aggtgatcaa 720 
ggggggaaga catctcatct tctgccactc aaagaagaag tgcgacgagc tcgccgcgaa 780
gctggtcgca 790



<210> 123 
<211> 583 
<212> DNA 

<213> Hepatitis C virus 



<400> 123 

ggacaactcc tctccaccag cagtgcccca gagcttccag gtggcccacc tgcatgctcc 60
caccggcagc ggtaagagca ccaaggtccc ggctgcgtac gcagcccagg gctacaaggt 120
gttggtgctc aacccctctg ttgctgcaac gctgggcttt ggtgcttaca tgtccaaggc 180
ccatggggtt gatcctaata tcaggaccgg ggtgagaaca attaccactg gcagccccat 240
cacgtactcc acctacggca agttccttgc cgacggcggg tgctcaggag gtgcttatga 300
cataataatt tgtgacgagt gccactccac ggatgccaca tccatcttgg gcatcggcac 360
tgtccttgac caagcagaga ctgcgggggc gagactggtt gtgctcgcca ctgctacccc 420
tccgggctcc gtcactgtgt cccatcctaa catcgaggag gttgctctgt ccaccaccgg 480
agagatcccc ttttacggca aggctatccc cctcgaggtg atcaaggggg gaagacatct 540
catcttctgc cactcaaaga agaagtgcga cgagctcgcc gcg 583



<210> 124 
<211> 617 
<212> DNA 

<213> Hepatitis C virus 
<400> 124 

ccttgccgac ggcgggtgct caggaggtgc ttatgacata ataatttgtg acgagtgcca 60
ctccacggat gccacatcca tcttgggcat cggcactgtc cttgaccaag cagagactgc 120
gggggcgaga ctggttgtgc tcgccactgc tacccctccg ggctccgtca ctgtgtccca 180
tcctaacatc gaggaggttg ctctgtccac caccggagag atcccctttt acggcaaggc 240
tatccccctc gaggtgatca aggggggaag acatctcatc ttctgccact caaagaagaa 300
gtgcgacgag ctcgccgcga agctggtcgc attgggcatc aatgccgtgg cctactaccg 360
cggtcttgac gtgtctgtca tcccgaccag cggcgatgtt gtcgtcgtgt cgaccgatgc 420
tctcatgact ggctttaccg gcgacttcga ctctgtgata gactgcaaca cgtgtgtcac 480
tcagacagtc gatttcagcc ttgaccctac ctttaccatt gagacaacca cgctccccca 540
ggatgctgtc tccaggactc aacgccgggg caggactggc agggggaagc caggcatcta 600
tagatttgtg gcaccgg 617



<210> 125 
<211> 559 



<212> DNA

<213> Hepatitis C virus 
<400> 125 

ctccacggat gccacatcca tcttgggcat cggcactgtc cttgaccaag cagagactgc 60
gggggcgaga ctggttgtgc tcgccactgc tacccctccg ggctccgtca ctgtgtccca 120
tcctaacatc gaggaggttg ctctgtccac caccggagag atcccctttt acggcaaggc 180
tatccccctc gaggtgatca aggggggaag acatctcatc ttctgccact caaagaagaa 240
gtgcgacgag ctcgccgcga agctggtcgc attgggcatc aatgccgtgg cctactaccg 300
cggtcttgac gtgtctgtca tcccgaccag cggcgatgtt gtcgtcgtgt cgaccgatgc 360
tctcatgact ggctttaccg gcgacttcga ctctgtgata gactgcaaca cgtgtgtcac 420
tcagacagtc gatttcagcc ttgaccctac ctttaccatt gagacaacca cgctccccca 480
ggatgctgtc tccaggactc aacgccgggg caggactggc agggggaagc caggcatcta 540
tagatttgtg gcaccgggg 559



<210> 126 
<211> 475 
<212> DNA 

<213> Hepatitis C virus 
<400> 126 

tgtgatagac tgcaacacgt gtgtcactca gacagtcgat ttcagccttg accctacctt 60
taccattgag acaaccacgc tcccccagga tgctgtctcc aggactcaac gccggggcag 120
gactggcagg gggaagccag gcatctatag atttgtggca ccgggggagc gcccctccgg 180
catgttcgac tcgtccgtcc tctgtgagtg ctatgacgcg ggctgtgctt ggtatgagct 240
cacgcccgcc gagactacag ttaggctacg agcgtacatg aacaccccgg ggcttcccgt 300
gtgccaggac catcttgaat tttgggaggg cgtctttacg ggcctcactc atatagatgc 360
ccacttttta tcccagacaa agcagagtgg ggagaacttt ccttacctgg tagcgtacca 420
agccaccgtg tgcgctaggg ctcaagcccc tcccccatcg tgggaccaga tgtgg 475



<210> 127 
<211> 390 
<212> DNA 

<213> Hepatitis C virus 
<400> 127 

tagatttgtg gcaccggggg agcgcccctc cggcatgttc gactcgtccg tcgtctgtga 60
gtgctatgac gcgggctgtg cttggtatga gctcacgccc gccgagacta cagttaggct 120
acgagcgtac atgaacaccc cggggcttcc cgtgtgccag gaccatcttg aattttggga 180
gggcgtcttt acgggcctca ctcatataga tgcccacttt ttatcccaga caaagcagag 240
tggggagaac tttccttacc tggtagcgta ccaagccacc gtgtgcgcta gggctcaagc 300
ccctccccca tcgtgggacc agatgtggaa gtgtttgatc cgccttaaac ccaccctcca 360
tgggccaaca cccctgctat acagactggg 390



<210> 128 
<211> 155 
<212> DNA 

<213> Hepatitis C virus 
<400> 128 

acgagcacct gggtgctcgt tggcggcgtc ctggctgctc tggccgcgta ttgcctgtca 60
acaggctgcg tggtcatagt gggcaggatc gtcttgtccg ggaagccggc aattatacct 120
gacagggagg ttctctacca ggagttcgat gagat 155



<210> 129 
<211> 56 
<212> DNA 

<213> Hepatitis C virus 
<400> 129 

ggctgctctg gccgcgtatt gcctgtcaac aggctgcgtg gtcatagtgg gcagga 56 



<210> 130 
<211> 625 
<212> DNA 

<213> Hepatitis C virus 
<400> 130 

ttttacagct gccgtcacca gcccactaac cactggccaa accctcctct tcaacatatt 60 

gggggggtgg gtggctgccc agctcgccgc ccccggtgcc gctactgcct ttgtgggtgc 120 

tggcctagct ggcgccgcca tcggcagcgt tggactgggg aaggtcctcg tggacattct 180 

tgcagggtat ggcgcgggcg tggcgggagc tcttgtagca ttcaagatca tgagcggtga 240 

ggtcccctcc acggaggacc tggtcaatct gctgcccgcc atcctctcgc ctggagccct 300 

tgtagtcggt gtggtctgcg cagcaatact gcgccggcac gttggcccgg gcgagggggc 360 

agtgcaatgg atgaaccggc taatagcctt cgcctcccgg gggaaccatg tttcccccac 420

gcactacgtg ccggagagcg atgcagccgc ccgcgtcact gccatactca gcagcctcac 480

tgtaacccag ctcctgaggc gactgcatca gtggataagc tcggagtgta ccactccatg 540

ctccggttcc tggctaaggg acatctggga ctggatatgc gaggtgctga gcgactttaa 600 

gacctggctg aaagccaagc tcatg 625 



<210> 131 
<211> 623 
<212> DNA 

<213> Hepatitis C virus 
<400> 131 

tactgccttt gtgggtgctg gcctagctgg cgccgccatc ggcagcgttg gactggggaa 60 
ggtcctcgtg gacattcttg cagggtatgg cgcgggcgtg gcgggagctc ttgtagcatt 120 
caagatcatg agcggtgagg tcccctccac ggaggacctg gtcaatctgc tgcccgccat 180
cctctcgcct ggagcccttg tagtcggtgt ggtctgcgca gcaatactgc gccggcacgt 240
tggcccgggc gagggggcag tgcaatggat gaaccggcta atagccttcg cctcccgggg 300
gaaccatgtt tcccccacgc actacgtgcc ggagagcgat gcagccgccc gcgtcactgc 360 
catactcagc agcctcactg taacccagct cctgaggcga ctgcatcagt ggataagctc 420 
ggagtgtacc actccatgct ccggttcctg gctaagggac atctgggact ggatatgcga 480 
ggtgctgagc gactttaaga cctggctgaa agccaagctc atgccacaac tgcctgggat 540 
tccctttgtg tcctgccagc gcgggtatag gggggtctgg cgaggagacg gcattatgca 600 
cactcgctgc cactgtggag ctg 623 



<210> 132 
<211> 678 
<212> DNA 






<213> Hepatitis C virus 



<400> 132 

cctcgtggac attcttgcag ggtatggcgc gggcgtggcg ggagctcttg tagcattcaa 60
gatcatgagc ggtgaggtcc cctccacgga ggacctggtc aatctgctgc ccgccatcct 120
ctcgcctgga gcccttgtag tcggtgtggt ctgcgcagca atactgcgcc ggcacgttgg 180
cccgggcgag ggggcagtgc aatggatgaa ccggctaata gccttcgcct cccgggggaa 240
ccatgtttcc cccacgcact acgtgccgga gagcgatgca gccgcccgcg tcactgccat 300
actcagcagc ctcactgtaa cccagctcct gaggcgactg catcagtgga taagctcgga 360
gtgtaccact ccatgctccg gttcctggct aagggacatc tgggactgga tatgcgaggt 420
gctgagcgac tttaagacct ggctgaaagc caagctcatg ccacaactgc ctgggattcc 480
ctttgtgtcc tgccagcgcg ggtatagggg ggtctggcga ggagacggca ttatgcacac 540
tcgctgccac tgtggagctg agatcactgg acatgtcaaa aacgggacga tgaggatcgt 600
cggtcctagg acctgcagga acatgtggag tgggacgttc cccattaacg cctacaccac 660
gggcccctgt actcccct 678



<210> 133 
<211> 436 
<212> DNA 

<213> Hepatitis C virus 
<400> 133 

tgcagggtat ggcgcgggcg tggcgggagc tcttgtagca ttcaagatca tgagcggtga 60
ggtcccctcc acggaggacc tggtcaatct gctgcccgcc atcctctcgc ctggagccct 120
tgtagtcggt gtggtctgcg cagcaatact gcgccggcac gttggcccgg gcgagggggc 180
agtgcaatgg atgaaccggc taatagcctt cgcctcccgg gggaaccatg tttcccccac 240
gcactacgtg ccggagagcg atgcagccgc ccgcgtcact gccatactca gcagcctcac 300
tgtaacccag ctcctgaggc gactgcatca gtggataagc tcggagtgta ccactccatg 360
ctccggttcc tggctaaggg acatctggga ctggatatgc gaggtgctga gcgactttaa 420
gacctggctg aaagcc 436



<210> 134 
<211> 164 
<212> DNA 

<213> Hepatitis C virus 



<400> 134 

agcccttgta gtcggtgtgg tctgcgcagc aatactgcgc cggcacgttg gcccgggcga 60
gggggcagtg caatggatga accggctaat agccttcgcc tcccggggga accatgtttc 120
ccccacgcac tacgtgccgg agagcgatgc agccgcccgc gtca 164



<210> 135 
<211> 496 
<212> DNA 

<213> Hepatitis C virus 
<400> 135 

cgcctcccgg gggaaccatg tttcccccac gcactacgtg ccggagagcg atgcagccgc 60
ccgcgtcact gccatactca gcagcctcac tgtaacccag ctcctgaggc gactgcatca 120
gtggataagc tcggagtgta ccactccatg ctccggttcc tggctaaggg acatctggga 180
ctggatatgc gaggtgctga gcgactttaa gacctggctg aaagccaagc tcatgccaca 240
actgcctggg attccctttg tgtcctgcca gcgcgggtat aggggggtct ggcgaggaga 300
cggcattatg cacactcgct gccactgtgg agctgagatc actggacatg tcaaaaacgg 360
gacgatgagg atcgtcggtc ctaggacctg caggaacatg tggagtggga cgttccccat 420
taacgcctac accacgggcc cctgtactcc ccttcctgcg ccgaactata agttcgcgct 480
gtggagggtg tctgca 496



<210> 136 
<211> 926 
<212> DNA 

<213> Hepatitis C virus 
<400> 136 

tacgtgccgg agagcgatgc agccgcccgc gtcactgcca tactcagcag cctcactgta 60
acccagctcc tgaggcgact gcatcagtgg ataagctcgg agtgtaccac tccatgctcc 120
ggttcctggc taagggacat ctgggactgg atatgcgagg tgctgagcga ctttaagacc 180
tggctgaaag ccaagctcat gccacaactg cctgggattc cctttgtgtc ctgccagcgc 240
gggtataggg gggtctggcg aggagacggc attatgcaca ctcgctgcca ctgtggagct 300
gagatcactg gacatgtcaa aaacgggacg atgaggatcg tcggtcctag gacctgcagg 360
aacatgtgga gtgggacgtt ccccattaac gcctacacca cgggcccctg tactcccctt 420
cctgcgccga actataagtt cgcgctgtgg agggtgtctg cagaggaata cgtggagata 480
aggcgggtgg gggacttcca ctacgtatcg ggtatgacta ctgacaatct taaatgcccg 540
tgccagatcc catcgcccga atttttcaca gaattggacg gggtgcgcct acacaggttt 600
gcgccccctt gcaagccctt gctgcgggag gaggtatcat tcagagtagg actccacgag 660
tacccggtgg ggtcgcaatt accttgcgag cccgaaccgg acgtagccgt gttgacgtcc 720
atgctcactg atccctccca tataacagca gaggcggccg ggagaaggtt ggcgagaggg 780
tcaccccctt ctatggccag ctcctcggct agccagctgt ccgctccatc tctcaaggca 840
acttgcaccg ccaaccatga ctcccctgac gccgagctca tagaggctaa cctcctgtgg 900
aggcaggaga tgggcggcaa catcac 926



<210> 137 
<211> 850 
<212> DNA 

<213> Hepatitis C virus 
<400> 137 

actcagcagc ctcactgtaa cccagctcct gaggcgactg catcagtgga taagctcgga 60
gtgtaccact ccatgctccg gttcctggct aagggacatc tgggactgga tatgcgaggt 120
gctgagcgac tttaagacct ggctgaaagc caagctcatg ccacaactgc ctgggattcc 180
ctttgtgtcc tgccagcgcg ggtatagggg ggtctggcga ggagacggca ttatgcacac 240
tcgctgccac tgtggagctg agatcactgg acatgtcaaa aacgggacga tgaggatcgt 300
cggtcctagg acctgcagga acatgtggag tgggacgttc cccattaacg cctacaccac 360
gggcccctgt actccccttc ctgcgccgaa ctataagttc gcgctgtgga gggtgtctgc 420
agaggaatac gtggagataa ggcgggtggg ggacttccac tacgtatcgg gtatgactac 480
tgacaatctt aaatgcccgt gccagatccc atcgcccgaa tttttcacag aattggacgg 540
ggtgcgccta cacaggtttg cgcccccttg caagcccttg ctgcgggagg aggtatcatt 600
cagagtagga ctccacgagt acccggtggg gtcgcaatta ccttgcgagc ccgaaccgga 660
cgtagccgtg ttgacgtcca tgctcactga tccctcccat ataacagcag aggcggccgg 720
gagaaggttg gcgagagggt cacccccttc tatggccagc tcctcggcta gccagctgtc 780
cgctccatct ctcaaggcaa cttgcaccgc caaccatgac tcccctgacg ccgagctcat 840
agaggctaac 850




<210> 138 
<211> 749 
<212> DNA 

<213> Hepatitis C virus 
<400> 138 

cagcctcact gtaacccagc tcctgaggcg actgcatcag tggataagct cggagtgtac 60 
cactccatgc tccggttcct ggctaaggga catctgggac tggatatgcg aggtgctgag 120
cgactttaag acctggctga aagccaagct catgccacaa ctgcctggga ttccctttgt 180
gtcctgccag cgcgggtata ggggggtctg gcgaggagac ggcattatgc acactcgctg 240
ccactgtgga gctgagatca ctggacatgt caaaaacggg acgatgagga tcgtcggtcc 300
taggacctgc aggaacatgt ggagtgggac gttccccatt aacgcctaca ccacgggccc 360
ctgtactccc cttcctgcgc cgaactataa gttcgcgctg tggagggtgt ctgcagagga 420
atacgtggag ataaggcggg tgggggactt ccactacgta tcgggtatga ctactgacaa 480 
tcttaaatgc ccgtgccaga tcccatcgcc cgaatttttc acagaattgg acggggtgcg 540 
cctacacagg tttgcgcccc cttgcaagcc cttgctgcgg gaggaggtat cattcagagt 600 
aggactccac gagtacccgg tggggtcgca attaccttgc gagcccgaac cggacgtagc 660 
cgtgttgacg tccatgctca ctgatccctc ccatataaca gcagaggcgg ccgggagaag 720 
gttggcgaga gggtcacccc cttctatgg 749



<210> 139 
<211> 257 
<212> DNA 

<213> Hepatitis C virus 
<400> 139 

gacctggctg aaagccaagc tcatgccaca actgcctggg attccctttg tgtcctgcca 60
gcgcgggtat aggggggtct ggcgaggaga cggcattatg cacactcgct gccactgtgg 120
agctgagatc actggacatg tcaaaaacgg gacgatgagg atcgtcggtc ctaggacctg 180
caggaacatg tggagtggga cgttccccat taacgcctac accacgggcc cctgtactcc 240
ccttcctgcg ccgaact 257



<210> 140 
<211> 285 
<212> DNA 

<213> Hepatitis C virus 
<400> 140 

tgagatcact ggacatgtca aaaacgggac gatgaggatc gtcggtccta ggacctgcag 60
gaacatgtgg agtgggacgt tccccattaa cgcctacacc acgggcccct gtactcccct 120
tcctgcgccg aactataagt tcgcgctgtg gagggtgtct gcagaggaat acgtggagat 180
aaggcgggtg ggggacttcc actacgtatc gggtatgact actgacaatc ttaaatgccc 240
gtgccagatc ccatcgcccg aatttttcac agaattggac ggggt 285



<210> 141 
<211> 228 
<212> DNA 

<213> Hepatitis C virus 
<400> 141 

catagaggct aacctcctgt ggaggcagga gatgggcggc aacatcacca gggttgagtc 60
agagaacaaa gtggtgattc tggactcctt cgatccgctt gtggcagagg aggatgagcg 120
ggaggtctcc gtacctgcag aaattctgcg gaagtctcgg agattcgccc gggccctgcc 180
cgtctgggcg cggccggact acaacccccc gctagtagag acgtggaa 228



<210> 142 
<211> 273 
<212> DNA 

<213> Hepatitis C virus 
<400> 142 

ccatggctgc ccgctaccac ctccacggtc ccctcctgtg cctccgcctc ggaaaaagcg 60
tacggtggtc ctcaccgaat caaccctatc tactgccttg gccgagcttg ccaccaaaag 120
ttttggcagc tcctcaactt ccggcattac gggcgacaat acgacaacat cctctgagcc 180
cgccccttct ggctgccccc ccgactccga cgttgagtcc tattcttcca tgccccccct 240
ggagggwag cctggggatc cggatctcag cga 273



<210> 143 
<211> 412 
<212> DNA 

<213> Hepatitis C virus 
<400> 143 

ttcctggaca ggcgcactcg tcaccccgtg cgctgcggaa gaacaaaaac tgcccatcaa 60
cgcactgagc aactcgttgc tacgccatca caatctggtg tattccacca cttcacgcag 120
tgcttgccaa aggcagaaga aagtcacatt tgacagactg caagttctgg acagccatta 180
ccaggacgtg ctcaaggagg tcaaagcagc ggcgtcaaaa gtgaaggcta acttgctatc 240
cgtagaggaa gcttgcagcc tgacgccccc acattcagcc aaatccaagt ttggctatgg 300
ggcaaaagac gtccgttgcc atgccagaaa ggccgtagcc cacatcaact ccgtgtggaa 360
agaccttctg gaagacagtg taacaccaat agacactacc atcatggcca ag 412



<210> 144 
<211> 903 
<212> DNA 

<213> Hepatitis C virus 
<400> 144 

ggctaacttg ctatccgtag aggaagcttg cagcctgacg cccccacatt cagccaaatc 60 
caagtttggc tatggggcaa aagacgtccg ttgccatgcc agaaaggccg tagcccacat 120 
caactccgtg tggaaagacc ttctggaaga cagtgtaaca ccaatagaca ctaccatcat 180 
ggccaagaac gaggttttct gcgttcagcc tgagaagggg ggtcgtaagc cagctcgtct 240 
catcgtgttc cccgacctgg gcgtgcgcgt gtgcgagaag atggccctgt acgacgtggt 300
tagcaagctc cccctggccg tgatgggaag ctcctacgga ttccaatact caccaggaca 360 
gcgggttgaa ttcctcgtgc aagcgtggaa gtccaagaag accccgatgg ggttctcgta 420 
tgatacccgc tgttttgact ccacagtcac tgagagcgac atccgtacgg aggaggcaat 480
ttaccaatgt tgtgacctgg acccccaagc ccgcgtggcc atcaagtccc tcactgagag 540 
gctttatgtt gggggccctc ttaccaattc aaggggggaa aactgcggct accgcaggtg 600 
ccgcgcgagc ggcgtactga caactagctg tggtaacacc ctcacttgct acatcaaggc 660 
ccgggcagcc tgtcgagccg cagggctcca ggactgcacc atgctcgtgt gtggcgacga 720 
cttagtcgtt atctgtgaaa gtgcgggggt ccaggaggac gcggcgagcc tgagagcctt 780 
cacggaggct atgaccaggt actccgcccc ccccggggac cccccacaac cagaatacga 840 
cttggagctt ataacatcat gctcctccaa cgtgtcagtc gcccacgacg gcgctggaaa 900
gag 903



<210> 145 
<211> 600 
<212> DNA 

<213> Hepatitis C virus 
<400> 145 

agaggaagct tgcagcctga cgcccccaca ttcagccaaa tccaagtttg gctatggggc 60 
aaaagacgtc cgttgccatg ccagaaaggc cgtagcccac atcaactccg tgtggaaaga 120 
ccttctggaa gacagtgtaa caccaataga cactaccatc atggccaaga acgaggtttt 180
ctgcgttcag cctgagaagg ggggtcgtaa gccagctcgt ctcatcgtgt tccccgacct 240 
gggcgtgcgc gtgtgcgaga agatggccct gtacgacgtg gttagcaagc tccccctggc 300
cgtgatggga agctcctacg gattccaata ctcaccagga cagcgggttg aattcctcgt 360 
gcaagcgtgg aagtccaaga agaccccgat ggggttctcg tatgataccc gctgttttga 420
ctccacagtc actgagagcg acatccgtac ggaggaggca atttaccaat gttgtgacct 480 
ggacccccaa gcccgcgtgg ccatcaagtc cctcactgag aggctttatg ttgggggccc 540 
tcttaccaat tcaagggggg aaaactgcgg ctaccgcagg tgccgcgcga gcggcgtact 600



<210> 146 
<211> 781 
<212> DNA 

<213> Hepatitis C virus 
<400> 146 

ccttctggaa gacagtgtaa caccaataga cactaccatc atggccaaga acgaggtttt 60 
ctgcgttcag cctgagaagg ggggtcgtaa gccagctcgt ctcatcgtgt tccccgacct 120 
gggcgtgcgc gtgtgcgaga agatggccct gtacgacgtg gttagcaagc tccccctggc 180 
cgtgatggga agctcctacg gattccaata ctcaccagga cagcgggttg aattcctcgt 240 
gcaagcgtgg aagtccaaga agaccccgat ggggttctcg tatgataccc gctgttttga 300 
ctccacagtc actgagagcg acatccgtac ggaggaggca atttaccaat gttgtgacct 360 
ggacccccaa gcccgcgtgg ccatcaagtc cctcactgag aggctttatg ttgggggccc 420
tcttaccaat tcaagggggg aaaactgcgg ctaccgcagg tgccgcgcga gcggcgtact 480 
gacaactagc tgtggtaaca ccctcacttg ctacatcaag gcccgggcag cctgtcgagc 540 
cgcagggctc caggactgca ccatgctcgt gtgtggcgac gacttagtcg ttatctgtga 600 
aagtgcgggg gtccaggagg acgcggcgag cctgagagcc ttcacggagg ctatgaccag 660 
gtactccgcc ccccccgggg accccccaca accagaatac gacttggagc ttataacatc 720 
atgctcctcc aacgtgtcag tcgcccacga cggcgctgga aagagggtct actaccttac 780 
c 781 



<210> 147 
<211> 382 
<212> DNA 

<213> Hepatitis C virus 
<400> 147 

cgttatctgt gaaagtgcgg gggtccagga ggacgcggcg agcctgagag ccttcacgga 60 
ggctatgacc aggtactccg ccccccccgg ggacccccca caaccagaat acgacttgga 120
gcttataaca tcatgctcct ccaacgtgtc agtcgcccac gacggcgctg gaaagagggt 180 
ctactacctt acccgtgacc ctacaacccc cctcgcgaga gccgcgtggg agacagcaag 240 
acacactcca gtcaattcct ggctaggcaa cataatcatg tttgccccca cactgtgggc 300
gaggatgata ctgatgaccc atttctttag cgtcctcata gccagggatc agcttgaaca 360
ggctcttaac tgtgagatct ac 382 



<210> 148 
<211> 268 
<212> DNA 

<213> Hepatitis C virus 
<400> 148 

cgtgtcagtc gcccacgacg gcgctggaaa gagggtctac taccttaccc gtgaccctac 60 
aacccccctc gcgagagccg cgtgggagac agcaagacac actccagtca attcctggct 120 
aggcaacata atcatgtttg cccccacact gtgggcgagg atgatactga tgacccattt 180
ctttagcgtc ctcatagcca gggatcagct tgaacaggct cttaactgtg agatctacgg 240 
agcctgctac tccatagaac cactggat 268 



<210> 149 
<211> 222 
<212> DNA 

<213> Hepatitis C virus 
<400> 149 

actccatggc ctcagcgcat tttcactcca cagttactct ccaggtgaaa tcaatagggt 60 
ggccgcatgc ctcagaaaac ttggggtccc gcccttgcga gcttggagac accgggcccg 120
gagcgtccgc gctaggcttc tgtccagagg aggcagggct gccatatgtg gcaagtacct 180 
cttcaactgg gcagtaagaa caaagctcaa actcactcca at 222 



<210> 150 
<211> 192 
<212> DNA 

<213> Hepatitis C virus 
<400> 150 

ctctccaggt gaaatcaata gggtggccgc atgcctcaga aaacttgggg tcccgccctt 60 
gcgagcttgg agacaccggg cccggagcgt ccgcgctagg cttctgtcca gaggaggcag 120 
ggctgccata tgtggcaagt acctcttcaa ctgggcagta agaacaaagc tcaaactcac 180 
tccaatagcg gc 192 



<210> 151 
<211> 10
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: linker 
sequence 



<400> 151 
gggccacgaa 10



<210> 152
<211> 13 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: linker 
sequence 

<400> 152 
ttcgtggccc ctg 13



<210> 153 
<211> 138 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: pP6 vector 
sequence 

<400> 153 

ctagccatgg ccgcaggggc cgcggccgca ctagtgggga tccttaatta aagggccact 60 
ggggcccccc gtaccggcgt ccccggcgcc ggcgtgatca cccctaggaa ttaatttccc 120 
ggtgaccccg ggggagct 138 



<210> 154 
<211> 128 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: pB5 vector 
sequence 

<400> 154 

catggccgca ggggccgcgg ccgcactagt ggggatcctt aattaaaggg ccactggggc 60 
cccccggcgt ccccggcgcc ggcgtgatca cccctaggaa ttaatttccc ggtgaccccg 120
ggggagct 128



<210> 155 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer
<400> 155 

gcgtttggaa tcactacagg 20 



Printed: 08-11-2001
00402225



<210> 156
<211> 19 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer 



<400> 156 

cacgatgcac gttgaagtg 19